Foreword


In a seminal paper published in the early 1970s, Manfred Eigen introduced a simple 
system of differential equations describing the evolution of an infinite population 
of macromolecules undergoing chemical reactions. The trajectories of this system 
converge to a unique equilibrium, called the “quasispecies” equilibrium. 

The book by Raphaël Cerf and Joseba Dalmau revisits in depth the notion of
quasispecies and demonstrates its remarkable universality in population dynamics, 
far beyond the original problem considered by Eigen. It explains how and why the 
quasispecies equilibrium can describe the long-term behavior of most classical models
in population dynamics, whether they are deterministic or stochastic, including
Moran–Kingman, multitype Galton–Watson, Wright–Fisher, continuous branching
and Moran models. The common unifying thread is the fact that the quasispecies
equilibrium is the normalized Perron–Frobenius eigenvector of the natural matrix
encoding the fitness and mutation probabilities of the macromolecules.

The authors have recently made significant contributions to our understanding of 
finite population models, in the regime where genotypes are large (compared to the 
size of the population) and mutations are small. The book offers a beautiful synthesis 
of these works, the most salient features of which are: several explicit formulas 
describing the quasispecies distribution and their links with classical combinatorial 
identities, the phase transition separating a quasispecies regime from a disordered 
regime, and a full proof for the Wright–Fisher model.

With the exception of a few classical results, all the results of the first four parts are 
rigorously demonstrated. The proofs are elegant, powerful and always accessible. The 
subsequent parts present some conjectures and more technical results, guiding the 
reader to further open questions. The text is a pleasure to read and can easily be used 
in several courses both in probability and Markov chains, population dynamics or 
mathematical ecology. In addition, the specialist, whether she/he is a mathematician 
or a theoretical ecologist or biologist, will find powerful ideas here for investigating 
finite population models. 


University of Neuchâtel, March 2022 Michel Benaïm
Chapter 1
Introduction


We are surrounded by a huge heterogeneity of living beings: insects, plants, animals 
and humans. Even the creatures belonging to the same species present an extraordinary
variability. And yet what we can see with the naked eye is but a tiny fraction of
the realm of the living. Indeed, bacteria, viruses, prions and dozens of other microbes
interact with us every day without our even noticing. No matter how different we are, the one
feature that we all share is that we have all been shaped by means of evolution:
a careful equilibrium between mutation and selection. Mutations are responsible for
having introduced all the changes in our genetic information, from a remote past 
until today, making us look as we do, while selection, caused by a combination of 
many internal as well as external forces, has preserved our lineage through history, 
by making it successful where many others have perished. 

The present text focuses on equilibrium, on the subtle balance between selection 
and mutation to which we owe the vast genetic heterogeneity in many of the living 
populations. Imagine a population evolving in an environment that selects certain 
genotypes over some others (selection meaning that the fittest genotypes produce, on 
average, more children than their less fit companions), and that mutations allow for 
the fit genotypes to become unfit and vice versa. On the one hand, if selection is mild,
and the mutation rate is very high, the most fit individuals will have no advantage, 
since their genotype will immediately mutate and become unfit. In fact, if we observe 
such a population, we will see how the different genotypes come and go, and all of 
them will eventually vanish to let new genotypes appear, thus never reaching an 
equilibrium. On the other hand, if selection is very strong, and the mutation rate is 
very low, the fittest genotype will take over the whole population, thus leaving no 
place for any variability in the population. 

We focus here on the intermediate situation, where selection and mutation com- 
pensate each other, and the population reaches an equilibrium. The main questions 
that we try to answer are: can this equilibrium be described as a function of the muta- 
tion and selection parameters? And if yes, to what extent is this description sensitive 
to the choice of the model? There exist several models describing the evolution of 
a population under mutation and selection, which encompass different features: the 
population size may be finite or infinite, constant or variable, the dynamics might 
be deterministic or random, the different generations may or may not overlap, the
mutations may happen during reproduction, or at any time of the life cycle. We 
consider throughout this text a series of models with these different features, and we 
study the equilibrium of the resulting mathematical processes. 

Our goal is to present a unified picture of the mutation-selection equilibrium,
which is valid for several classical models in an adequate regime. Our starting 
point is the quasispecies equation, a general non-linear equation that describes the 
mutation-selection equilibrium of all the different models considered throughout 
the text. This equation arises naturally in Eigen’s model [31], and it was Manfred 
Eigen and Peter Schuster who coined the term quasispecies to refer to the mutation- 
selection equilibrium of this model [33]. Notice that the quasispecies model of Eigen 
is dynamical in nature; what we call the quasispecies equation is the mere equilibrium 
equation which arises from Eigen’s model. 

There exists a huge literature on Eigen’s model and quasispecies theory. Our goal 
here is not to present a synthesis of all the works on the quasispecies subject. For 
readers who wish to learn more on the various aspects of quasispecies theory and 
its application to the study of viruses, we refer to the books [27, 28, 30] and to the 
review papers [7, 13, 29]. Let us try instead to explain the objectives of our text and 
its relationship with other works on quasispecies. 

Eigen’s model is defined through a set of differential equations which describe the 
evolution of an infinite population of macromolecules [31]. Within this framework, 
Eigen and Schuster studied a specific stylized landscape, called the single peak land- 
scape or sharp peak landscape, in which there is only one favored sequence, called 
the Master sequence [32]. Despite the simplicity of the model, they were able to 
derive very interesting results, namely the existence of an error threshold, and the 
formation of a specific population structure which they called a quasispecies. These 
concepts became widely used in biology to discuss the evolution of a population 
driven by mutation and selection, especially in the context of virus populations. 
Viruses being simpler organisms, and their mutation rates being very high, the concepts
of quasispecies and error threshold are particularly relevant in understanding
populations of viruses, as shown by the extensive use of the term even in the more
recent literature aimed at understanding the recent outbreak of the SARS-CoV-2
virus [1, 40, 48, 49, 87]. However, there is a major theoretical obstacle to applying
Eigen’s model to viruses, which was raised in [50]. Indeed, Eigen’s model was initially
formulated for an infinite population, whose genotype has finite length, whereas
biological populations are finite with a typical population size much smaller than the 
number of possible genotypes. In addition to that, several finite population models of 
evolution have also been studied over the last decades, which incorporate stochastic 
effects, in the field of population genetics. The most classical ones are the Moran 
model, the Wright–Fisher model and the Galton–Watson process (see for instance
the classical book [35]). This leads to a natural debate: is quasispecies theory suitable 
to describe the evolution of viruses, and what is its link with population genetics? 

This question was fully addressed by Wilke [86]. Wilke argued that there is 
no disagreement between the population genetics of haploid, asexually replicating 
organisms and quasispecies theory, and he demonstrated it for the model of evolution 
of a single locus with two alleles, and for the mutational load studied by Kimura and
Maruyama [56]. Furthermore, Wilke discussed whether quasispecies theory applies 
to finite populations. Several works aim at building a model for a finite size population 
which derives from Eigen’s model, by introducing an approximation scheme to the 
deterministic differential equations, which incorporates stochastic effects: Alves and 
Fontanari [2], Demetrius, Schuster and Sigmund [25], McCaskill [62], Gillespie 
[41], Weinberger [85], and more recently Musso [67] and Dixit, Srivastava, Vishnoi 
[26]. In a very influential paper, Nowak and Schuster [69] constructed a birth and 
death model to approximate Eigen’s model. Finite size corrections to Eigen’s model 
have also been computed with the help of complex methods from statistical physics 
[3, 58, 73, 75, 76]. Moreover, various computer simulations have demonstrated 
that the predictions of quasispecies theory can be observed for finite populations. 
Comas, Moya and Gonzalez-Candelas [19] studied how the population size affects 
the survival of the flattest. Ochoa [70] performed extensive simulations in the context 
of genetic algorithms. However, these theoretical studies and these simulations deal
with very stylized fitness landscapes and simple reproduction mechanisms, which 
are still very far from the complexity of real viruses. In the end, Wilke [86] concludes 
that there is nothing that could contradict the existence of the quasispecies effect in 
finite populations, and at the same time there is no true experimental evidence in its 
favor. The debate on the relevance of quasispecies concepts to the study of viruses 
is still ongoing [47, 72], and it seems to be quite open up to now. 

This text is essentially a mathematical development of the questions discussed 
in Wilke’s paper [86]. Namely, we wish to study further the mathematical links 
between Eigen’s model and classical finite population models. Over the past few 
years, we have been investigating this question by examining successively several 
models. In each of these models, we found out that a quasispecies can be formed 
in a suitable asymptotic regime of the parameters [15, 16, 17, 20, 22]. However,
each model required a different treatment and lengthy proofs. Naturally, we tried 
to unify these results and to understand the common features which lead to the 
formation of a quasispecies. Currently, it seems to us that the central object which 
can be recovered in each case is the quasispecies equation, namely the equilibrium 
equation associated to Eigen’s model. This is why this review text is centered on this 
quasispecies equation. 

This text has several goals. A first goal is to show how the quasispecies equation is 
naturally linked with classical population models. This is essentially a mathematical 
formalization of one of the key points exposed in Wilke’s paper [86]. A second goal 
is to study the quasispecies equation itself. Obviously, this equation is extremely 
complex and a rigorous mathematical analysis can be conducted only in specific 
cases. We discuss first the case of a finite genotype space, then we move on to the 
sharp peak landscape and finally to the case of class-dependent fitness landscapes. 
While studying the Moran model, we obtained an explicit formula for the quasispecies 
distribution on the sharp peak landscape [17], which appeared again for the Wright–Fisher
model [20] and the Galton–Watson process [22]. We finally understood that
this formula was in fact a solution of the limiting quasispecies equation, as we show 
in this text. A third goal is to show how the quasispecies and the error threshold 
phenomenon emerge in finite population models (which was in fact the original
motivation of the works [15, 16, 17, 20, 22]). We tried to streamline the different 
proofs of our previous works in order to present a more general robust approach to 
these apparently similar results. To this end, we relied on ideas coming from the 
theory of random perturbations of dynamical systems of Freidlin and Wentzell [38]. 

At the end of the day, one may wonder whether it is worth developing all these 
mathematical techniques to prove facts which were certainly well known by theoretical
biologists. On the one hand, the mathematics associated to these apparently simple
models is rich and beautiful, and certainly deserves to be investigated, in order to 
give precise answers to the previous questions. For instance, a delicate point, which 
is also raised by Wilke [86], is to understand the dependence of the error threshold 
on the population size. The approach presented here shows that, for a quasispecies 
to be formed, the population size has to be at least of the same order as the genotype 
length. On the other hand, these investigations lead to the development of new simple 
formulas for the quasispecies distributions. Because of the simplicity of the models, 
these formulas will not help directly to check the validity of the models, yet we 
believe they constitute a modest step in this direction. Indeed, deep sequencing
techniques yield a huge amount of data on the genotype of the viruses, and we need 
theoretical models to explain the structure of these data. For instance, in the case 
of class-dependent fitness landscapes, we have obtained a formula which allows us 
to recover the fitness landscape if we are given the concentration of each Hamming 
class at equilibrium. Unfortunately, because of all the simplifying features that lead 
to it, this formula is unlikely to be realistic. Nevertheless, we hope that it is a good 
starting point for theoretical discussions, and that it will be extended in due time 
when operational models of real fitness landscapes become available.

Of course, the results presented here are not valid for every model of mutation-selection.
The formula for the quasispecies distribution seems to depend on some specific
assumptions; for instance, it is crucial that mutations occur only during a reproduction
event. In the well-studied Crow–Kimura model, mutations and reproduction
events are decoupled, and the equilibrium quasispecies equation is different from the
one we consider here, so we choose not to speak about this class of models. However,
several very interesting mathematical works have investigated the quasispecies
theory within the framework of the Crow–Kimura model with the aim of finding
formulas for the quasispecies. Important mathematical contributions are the papers
[8, 45], where general criteria for the existence of an error threshold are discussed.
In [8], the quasispecies equilibrium is characterized with the help of an approximate
variational principle, under adequate approximation hypotheses on the mutation and
reproduction rates, both for the Crow–Kimura and the Eigen model. These results
are more general and robust than the results we present here. However, as noted in 
[12, 13], one has to check carefully the limiting procedures required to apply these 
results. In a series of recent works, Bratus, Novozhilov and Semenov [12, 13, 78] and 
Novozhilov and Semenov [79, 80, 81] study the quasispecies equation. They derive 
many interesting properties of the solutions under some symmetry assumptions on 
the fitness landscape, using a spectral approach. Their framework is also more general
than ours. In the case of the sharp peak landscape for the Crow–Kimura model,
they manage in [12, 79] to obtain exact expressions of the quasispecies distribution,
with a wealth of additional information concerning the speed of convergence and
the error threshold. We believe that the work we do here for Eigen’s model can also
be done for the Crow–Kimura model. More precisely, each finite population model
considered here has its counterpart, in the form of a model where the mutations and
the reproduction events are decoupled, and the equilibrium equation of the Crow–Kimura
model should be recovered in some adequate asymptotic regime. In this
scheme, the counterparts to the combinatorial formulas for the asymptotic quasispecies
distribution are to be found in [12, 79]. Finally, the papers [78, 80] implement
an approach similar to [12, 79] in order to solve the quasispecies equation of Eigen’s 
model with various fitness landscapes. In the case of the sharp peak landscape, they 
also obtain a beautiful exact formula for the solution to the quasispecies equation, 
valid for any genotype length. 

Let us finally describe the structure of the text. The quasispecies equation constitutes
the backbone of the exposition: it is introduced in part I and all the subsequent
parts are closely related to it. The central results are presented in parts II and III.
In part II, the sharp peak landscape is introduced. We explain the error threshold
phenomenon and we give an explicit formula for the quasispecies distribution. While
part II deals with Eigen’s model, in part III we present the counterpart of the error
threshold in classical finite population models, namely the Wright–Fisher model and
the Moran model. A full detailed proof of the main result for the Wright–Fisher
model is given in part IV. In part V, we consider class-dependent fitness landscapes, 
which give rise to generalized quasispecies distributions. Part VI deals with the 
dynamical aspects of the models. 


Part I 
Finite Genotype Space 


Overview of Part I 


Instead of starting by introducing the different models, we begin by giving the 
equilibrium equation or quasispecies equation right away, which can be derived in 
a very simple manner, just by thinking about what mutation-selection equilibrium 
must mean. Our first chapter focuses on solving the equilibrium equation, that is, on 
characterizing it as a function of the selection and mutation parameters. Chapters 3 
and 4 introduce the different models we will deal with in the rest of the text. In 
chapter 3, we introduce three models, the common feature of all these models being 
that the different generations do not overlap, in particular the time is discrete. The 
first of them is the Moran–Kingman model, where the population is taken to be
infinite. The second model is the Galton–Watson model, where the population is
finite, but varies over time, while the third is the Wright–Fisher model, where the
population is also finite, but constant over time. In chapter 4, we introduce three 
continuous time models with overlapping generations, namely: Eigen’s model for 
an infinite population, the continuous branching process for a finite population with 
variable size, and the Moran model for a finite and constant-size population. The 
common features shared by all the models we consider are: 

• The population is well mixed, that is, there is no geographical structure, and
the proportions of the different genotypes suffice to give a full description of a
population.

• Individuals die at reproduction, either their own, or some other individuals’.

• Mutations happen during reproduction.

In addition to introducing the models, we also show in chapters 3 and 4 how the 
quasispecies equation arises in all of these models. 


Chapter 2
The Quasispecies Equation


In this chapter, we first introduce the general quasispecies equation. We then present 
the classical Perron–Frobenius theorem and apply it to solve the quasispecies equation
in the case where the set of genotypes is finite, under some additional
assumptions.


2.1 The Equilibrium Equation 


We consider a population of individuals evolving under the conjugate effects of 
mutation and selection. Individuals reproduce, yet the reproduction mechanism is 
error-prone, and mutations occur constantly. These mutations drive the genotypes 
away from the current equilibrium. 

Let us introduce some notation in order to describe the model precisely. We 
denote by E the set of the possible genotypes (the set E might be finite or infinite).
Generic elements of E are denoted by the letters u, v. The Darwinian fitness of an
individual having genotype u is denoted by A(u), and can be thought of as its mean
number of offspring. Let us denote by c(u), u ∈ E, the fraction of individuals of type
u in the population at equilibrium. Without mutations, the quantity c(u) would be
proportional to A(u). When mutations occur in each reproduction cycle, an individual
of type u might appear as the result of mutations from offspring of other types. Let
us denote by M(v, u) the probability that the offspring of an individual of type v is of
type u. We call (M(u, v), u, v ∈ E) the mutation matrix. Of course, we have

\[
\forall u \in E \qquad \sum_{v \in E} M(u, v) = 1 .
\]


At equilibrium, the fraction c(u) of individuals of type u in the population has to be
proportional to the mean production of individuals of type u, that is, there exists an
α > 0 such that

\[
\forall u \in E \qquad c(u) = \alpha \sum_{v \in E} c(v) A(v) M(v, u) .
\]

Summing these equations over u, and using the fact that the fractions c(u) sum to 1
and that each row of M sums to 1, we get

\[
1 = \alpha \sum_{v \in E} c(v) A(v) .
\]

The sum on the right-hand side represents the mean fitness of the population at
equilibrium. Therefore α has to be equal to the inverse of the mean fitness of
the population at equilibrium, and we conclude that the fractions c(u) satisfy the
following set of equations:

\[
\forall u \in E \qquad c(u) \sum_{v \in E} c(v) A(v) = \sum_{v \in E} c(v) A(v) M(v, u) \tag{2.1}
\]

subject to the constraint

\[
\forall u \in E \quad c(u) \geq 0 , \qquad \sum_{u \in E} c(u) = 1 . \tag{2.2}
\]


In chapters 3 and 4, we will show how these equations characterize the equilibrium 
in several classical models in population genetics and mathematical biology. One 
of these models is due to Eigen [31], who studied the equilibrium equations in
detail, and found that for certain choices of the selection and mutation parameters 
A and M, the above equilibrium has the following feature: the fittest genotype has 
a positive but possibly low concentration, and the mutants that are close to the 
fittest genotypes have positive concentrations too. Eigen and Schuster [33] coined 
the term quasispecies in order to describe this kind of equilibrium, as opposed to 
a species, where the fittest genotype would have a proportion close to 1. Due to 
the relevance of the concept of quasispecies in several areas of biology, we shall 
refer to the system of equations (2.1) as the quasispecies equation or the equilibrium 
equation. From part II onwards, we focus on the particular choices of A and M that
are more pertinent from the quasispecies perspective, but before doing so we make 
an attempt at solving the equilibrium equation for arbitrary A and M. Unfortunately, 
the quasispecies equation cannot be solved analytically in general. We shall therefore 
focus on a more specific framework. Throughout this chapter, we consider the case 
where the space of genotypes E is a finite set. The case where E is infinite is much 
more delicate and mathematically challenging, and it cannot be analyzed in full 
generality. 


2.2 The Perron–Frobenius Theorem


When the space of genotypes is finite, the key tool to solve the quasispecies equation
(2.1) is the famous Perron–Frobenius theorem [82]. We state here a simplified
version for symmetric matrices, which will be enough for our purposes, and for
which the proof is considerably simpler than for the general case. 


Proposition 2.1 Let B be a square matrix, which is symmetric, and whose entries are
all positive. Then its eigenvalues are real, the largest eigenvalue λ is positive, and the
corresponding eigenspace has dimension one. Moreover there exists an eigenvector
associated to λ whose coordinates are all positive. Finally any eigenvector of B
whose coordinates are all non-negative is associated to λ.


Proof Since B is symmetric and real, all its eigenvalues are real. The sum of its
eigenvalues is equal to its trace, which is positive, thus the largest eigenvalue of B is
positive, let us call it λ. Let $y = (y(u))_{u \in E}$ be an eigenvector associated to λ:

\[
\forall u \in E \qquad \lambda\, y(u) = \sum_{v \in E} B(u, v)\, y(v) .
\]

We can assume that the Euclidean norm of y is 1, i.e., ⟨y, y⟩ = 1, where ⟨·, ·⟩ denotes
the standard scalar product in $\mathbb{R}^E$. Multiplying the previous equation by y(u) and
summing over u ∈ E, we get

\[
\lambda = \sum_{u, v \in E} y(u) B(u, v) y(v) = \langle y, B y \rangle .
\]

Let us denote by |y| the vector $(|y(u)|)_{u \in E}$. Since the entries of B are positive, we
deduce from the previous identity that

\[
\lambda = \langle y, B y \rangle \leq \langle |y|, B |y| \rangle \leq \sup_{z : \langle z, z \rangle = 1} \langle z, B z \rangle .
\]

However, since B is symmetric and real, the last supremum is precisely equal to λ.
Therefore all the previous inequalities were in fact equalities. Since all the entries
of B are positive, we conclude that all the entries of y have the same sign. The
eigenvector identity implies furthermore that no entry of y is null. So far, we have
proved that an eigenvector associated to λ has all its entries positive, or all negative,
and none of them is zero. Let y, z be two eigenvectors associated to λ. We choose
a real number α so that the first coordinate of y − αz vanishes. Since we have
B(y − αz) = λ(y − αz), the vector y − αz is either null or an eigenvector associated
to λ. The latter is impossible, because such an eigenvector has no zero entry, so
necessarily y − αz = 0. Thus the eigenspace associated to
λ has dimension one. Finally, let y be an eigenvector associated to λ with positive
coordinates and let z be another eigenvector of B with non-negative coordinates,
associated to an eigenvalue μ. The eigenvalue identity implies that μ is positive and
that all the coordinates of z are positive as well. We can then find α > 0 sufficiently
small so that z(u) ≥ αy(u) for u ∈ E. We have then, for any n ≥ 1,

\[
\langle z, B^n z \rangle = \mu^n \langle z, z \rangle \geq \langle \alpha y, B^n \alpha y \rangle = \alpha^2 \lambda^n \langle y, y \rangle .
\]

Sending n to infinity, we conclude that μ ≥ λ, therefore μ = λ. □
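
To make proposition 2.1 concrete, here is a minimal numerical sketch, which is not part of the original argument. It runs a plain power iteration on a small symmetric matrix with positive entries and checks that the limiting vector has positive coordinates and satisfies the eigenvector identity. The matrix `B_example`, the function names and the number of iterations are assumptions of ours, chosen only for illustration.

```python
import math

def mat_vec(B, y):
    """Return the product B y for a square matrix given as a list of rows."""
    return [sum(row[v] * y[v] for v in range(len(y))) for row in B]

def scalar(x, y):
    """Standard scalar product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def power_iteration(B, steps=500):
    """Approximate the largest eigenvalue of B and a unit eigenvector.

    Assumes B is symmetric with positive entries, as in proposition 2.1.
    """
    n = len(B)
    y = [1.0 / math.sqrt(n)] * n  # positive starting vector of norm 1
    for _ in range(steps):
        z = mat_vec(B, y)
        norm = math.sqrt(scalar(z, z))
        y = [x / norm for x in z]
    # The Rayleigh quotient <y, By> approximates the largest eigenvalue.
    return scalar(y, mat_vec(B, y)), y

# Hypothetical 3x3 example, symmetric with positive entries.
B_example = [[2.0, 1.0, 0.5],
             [1.0, 3.0, 1.0],
             [0.5, 1.0, 1.0]]

lam, y = power_iteration(B_example)
# Largest deviation from the eigenvector identity B y = lam y.
residual = max(abs(b - lam * x) for b, x in zip(mat_vec(B_example, y), y))
print("largest eigenvalue:", lam)
print("eigenvector:", y)
print("residual:", residual)
```

Power iteration is used here only because it needs no external library; any standard symmetric eigenvalue routine would serve the same purpose.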
2.3 Solutions


We shall now use the Perron–Frobenius theorem to solve the quasispecies equation
in the case where the set of genotypes is finite, and under some additional hypotheses.
Suppose that $(c(u))_{u \in E}$ is a solution to the system (2.1) which satisfies the
constraint (2.2). Let λ be the mean fitness, given by

\[
\lambda = \sum_{v \in E} c(v) A(v) ,
\]

and let us set $d(v) = \sqrt{A(v)}\, c(v)$ for v ∈ E. These new variables satisfy

\[
\forall u \in E \qquad \lambda\, d(u) = \sum_{v \in E} d(v) \sqrt{A(v)}\, M(v, u) \sqrt{A(u)} . \tag{2.3}
\]

Therefore $(d(u))_{u \in E}$ is an eigenvector of the matrix $(\sqrt{A(v)}\, M(v, u) \sqrt{A(u)})_{u, v \in E}$. The question
of the existence and uniqueness of the solutions will be settled with the help of a
result from linear algebra and the following hypothesis.


Hypothesis 2.2 We suppose that the genotype space E is finite, that the fitness 
function A is positive, that the mutation matrix M is symmetric and that all its entries 
are positive. 


Suppose that hypothesis 2.2 holds. We can apply proposition 2.1 to the matrix 


\[
B(u, v) = \sqrt{A(v)}\, M(v, u) \sqrt{A(u)} .
\]

If $(d(u))_{u \in E}$ is a solution to (2.3) with non-negative entries, then λ has to be the
largest eigenvalue of B and $(d(u))_{u \in E}$ is an eigenvector associated to λ. Since the
corresponding eigenspace has dimension one, there is a unique choice satisfying the
constraint (2.2). Therefore, under hypothesis 2.2, the system (2.1) admits a unique
solution satisfying the constraint (2.2). In fact, this result still holds if we relax the
hypothesis that the mutation matrix M is symmetric. We would then appeal to
the general Perron–Frobenius theorem [82] to get the conclusion.
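
For the reader's convenience, the change of variables used above can be written out step by step. This is only a sketch of the computation, under the conventions that λ denotes the mean fitness $\sum_{v \in E} c(v) A(v)$, that $d(u) = \sqrt{A(u)}\, c(u)$, and that $B(u, v) = \sqrt{A(v)}\, M(v, u) \sqrt{A(u)}$. Multiplying equation (2.1) by $\sqrt{A(u)}$ gives

\[
\begin{aligned}
\lambda\, d(u)
  &= \sqrt{A(u)}\; \lambda\, c(u)
   = \sqrt{A(u)} \sum_{v \in E} c(v) A(v) M(v, u) \\
  &= \sum_{v \in E} \Big( \sqrt{A(v)}\, c(v) \Big) \sqrt{A(v)}\, M(v, u) \sqrt{A(u)}
   = \sum_{v \in E} B(u, v)\, d(v) .
\end{aligned}
\]

When M is symmetric, so is B, since

\[
B(v, u) = \sqrt{A(u)}\, M(u, v) \sqrt{A(v)} = \sqrt{A(v)}\, M(v, u) \sqrt{A(u)} = B(u, v) ,
\]

which is what allows the symmetric version of the Perron–Frobenius theorem to be applied. Conversely, c is recovered from d through $c(u) = d(u) / \sqrt{A(u)}$, which is legitimate because the fitness function A is positive.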


Notation. Throughout part I, we assume that hypothesis 2.2 holds. We define the 
matrix W by setting 


\[
\forall u, v \in E \qquad W(u, v) = A(u) M(u, v) . \tag{2.4}
\]


We call the matrix W the mean reproduction matrix. For u, v ∈ E, the quantity W(u, v)
represents the mean number of offspring of type v produced by an individual of type
u. We denote by λ the Perron–Frobenius eigenvalue of the matrix W and by c* the
associated positive left eigenvector, normalized so that the sum of its components
is equal to 1. The vector c* is the unique solution of the quasispecies equation (2.1)
which satisfies the constraint (2.2). The link between the quasispecies equation and
the Perron–Frobenius eigenvector has been known for a long time and it is used in
many works, for instance [77, 78, 86].
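
As an illustration of this notation, the quasispecies equation can be solved by hand when there are only two genotypes. The sketch below is a toy example of ours, not taken from the text: it assumes E = {0, 1}, a fitness A(0) = sigma and A(1) = 1, and a symmetric mutation matrix with M(0, 1) = M(1, 0) = q. The function name and the parameter values are hypothetical. The Perron–Frobenius eigenvalue of the 2×2 matrix W is the larger root of its characteristic polynomial, and the left eigenvector follows from the first coordinate of the identity cW = λc.

```python
import math

def two_type_quasispecies(sigma, q):
    """Closed-form solution of (2.1)-(2.2) for a two-genotype toy model.

    Assumptions (ours): A(0) = sigma > 0, A(1) = 1, M(0,1) = M(1,0) = q
    with 0 < q < 1.
    """
    A = [sigma, 1.0]
    M = [[1.0 - q, q], [q, 1.0 - q]]
    # Mean reproduction matrix W(u, v) = A(u) M(u, v), as in (2.4).
    W = [[A[u] * M[u][v] for v in range(2)] for u in range(2)]
    trace = W[0][0] + W[1][1]
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    # Larger root of the characteristic polynomial x^2 - trace x + det.
    lam = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    # First coordinate of c W = lam c: c(1) W(1,0) = (lam - W(0,0)) c(0).
    raw = [W[1][0], lam - W[0][0]]
    total = raw[0] + raw[1]
    c = [x / total for x in raw]  # normalized so that the sum is 1
    return A, M, lam, c

A, M, lam, c = two_type_quasispecies(sigma=3.0, q=0.1)
mean_fitness = sum(c[v] * A[v] for v in range(2))
# Difference between the two sides of equation (2.1), for each genotype u.
residuals = [
    c[u] * mean_fitness - sum(c[v] * A[v] * M[v][u] for v in range(2))
    for u in range(2)
]
print("eigenvalue:", lam, "mean fitness:", mean_fitness)
print("quasispecies:", c, "residuals:", residuals)
```

In this toy case one can check directly that the mean fitness at equilibrium coincides with the Perron–Frobenius eigenvalue of W, as the derivation of (2.1) requires.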


Chapter 3
Non-Overlapping Generations


In this chapter, we present three models of population genetics, namely the Moran–Kingman
model, the Galton–Watson model, and the Wright–Fisher model. We show
how to relate them with the quasispecies equation. A fundamental feature shared by 
these three models is that their successive generations are non-overlapping, meaning 
that the whole population is fully resampled from one generation to the next. 


3.1 The Moran-Kingman Model 


We begin by introducing the linear model, one of the simplest models for the evolution 
of a population with selection and mutation. Let us denote by $N_n(u)$ the number of
individuals of type u in the generation n. The linear model assumes that an individual
of type v produces offspring at a rate proportional to its fitness A(v), and that a
proportion M(v, u) of the offspring mutates and becomes of type u, thus $N_{n+1}(u)$ is
given by the formula

\[
\forall u \in E \qquad N_{n+1}(u) = \sum_{v \in E} N_n(v) A(v) M(v, u) .
\]


The trouble with this formula is that the sum is not necessarily an integer. To get 
around this problem, a natural approach is to develop stochastic population models, 
in such a way that the above formula describes the evolution of the mean number 
of individuals. The archetype of this kind of model is the Galton–Watson branching
process, which is the object of section 3.2. If we introduce in addition
a constraint on the total size of the population, we get the classical
Wright–Fisher model, which is introduced in section 3.3. Yet the randomness adds
an extra layer of complexity and stochastic models are considerably harder to study. 
Another simpler possibility is to consider the proportions of each type of individual 
in the population, instead of their numbers, as Moran and Kingman did in the late 
seventies [57, 65, 66]. Let us denote by $c_n(u)$ the proportion of individuals of type u
in the generation n. The model proposed by Moran is given by

\[
\forall u \in E \qquad c_{n+1}(u) = \frac{\sum_{v \in E} c_n(v) A(v) M(v, u)}{\sum_{v \in E} c_n(v) A(v)} . \tag{3.1}
\]

We call this model the Moran–Kingman model; it is not to be confused with the
well-known Moran model, which is a stochastic model for the evolution of a finite
population. Let us introduce an adequate framework to study this model. We consider
the finite-dimensional simplex S,

\[
S = \Big\{ c \in [0, 1]^E : \sum_{u \in E} c(u) = 1 \Big\} ,
\]

and we define a map Φ from S to S by

\[
\forall u \in E \qquad \Phi(c)(u) = \frac{\sum_{v \in E} c(v) A(v) M(v, u)}{\sum_{v \in E} c(v) A(v)} . \tag{3.2}
\]

The Moran–Kingman model is the dynamical system on S defined by the iteration
of the map Φ:

\[
\forall n \geq 0 \qquad c_{n+1} = \Phi(c_n) . \tag{3.3}
\]


The main result concerning the Moran–Kingman model is given in the following
theorem.


Theorem 3.1 The dynamical system (3.1) has a unique fixed point, which coincides
with c*. Moreover, for every c ∈ S, the dynamical system (3.1) with initial condition
$c_0 = c$ converges to c*, i.e.,

\[
\lim_{n \to \infty} c_n = c^* .
\]


Proof The equilibrium equation for the dynamical system (3.1) is the fundamental
equation (2.1), and, as we have seen, it has a unique solution on the simplex $S$, given
by the vector $c^*$. Using the mean reproduction matrix $W$ defined in (2.4), we have

$$\forall n \geq 1 \quad \forall u \in E \qquad c_n(u) = \frac{(c_0 W^n)(u)}{|c_0 W^n|_1}\,,$$

where $|c|_1$ is the sum of the absolute values of the components of the vector $c$, i.e.,

$$\forall c \in \mathbb{R}^E \qquad |c|_1 = \sum_{u \in E} |c(u)|\,.$$


The asymptotic behavior of the powers of the matrix $W$ is given by

$$\forall u, v \in E \qquad \lim_{n \to \infty} \frac{1}{\lambda^n} W^n(u, v) = d^*(u) c^*(v)\,,$$

where $\lambda$ is the Perron–Frobenius eigenvalue of the matrix $W$, and $d^*$ the right
eigenvector associated to it, normalized so that the scalar product of $d^*$ and $c^*$ is
equal to 1. For a proof of this result, see for instance theorem 1.2 in [82]. We deduce
that, for any $c_0 \in S$,

$$\lim_{n \to \infty} c_n(u) = \lim_{n \to \infty} \frac{(c_0 W^n)(u)}{|c_0 W^n|_1} = \frac{\sum_{v \in E} c_0(v) d^*(v) c^*(u)}{\sum_{v, w \in E} c_0(v) d^*(v) c^*(w)} = c^*(u)\,,$$

and this is the desired result. □
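
As a numerical illustration of theorem 3.1, the following sketch iterates the map $\Phi$ of (3.2) and compares the result with the normalized left Perron–Frobenius eigenvector of $W$. The three-type fitness vector `A` and mutation matrix `M` are hypothetical values chosen for the example, and the helper names `quasispecies` and `phi` are ours.

```python
import numpy as np

def quasispecies(A, M):
    """Perron-Frobenius eigenvalue and normalized left eigenvector of W."""
    W = A[:, None] * M                      # W(u, v) = A(u) M(u, v)
    eigvals, eigvecs = np.linalg.eig(W.T)   # right eigenvectors of W^T = left of W
    k = np.argmax(eigvals.real)             # the Perron-Frobenius eigenvalue is real
    c = np.abs(eigvecs[:, k].real)          # its eigenvector has entries of one sign
    return eigvals[k].real, c / c.sum()

def phi(c, A, M):
    """One step of the map (3.2): Phi(c) = c W / |c W|_1."""
    cW = (c * A) @ M
    return cW / cW.sum()

# Hypothetical example with three types (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

lam, c_star = quasispecies(A, M)

c = np.array([0.0, 0.0, 1.0])               # arbitrary starting point in S
for _ in range(200):
    c = phi(c, A, M)

gap = np.abs(c - c_star).max()              # distance between c_200 and c*
print(lam, c_star, gap)
```

The speed of convergence is governed by the ratio between the second largest eigenvalue modulus of $W$ and $\lambda$, so matrices with a small spectral gap would need more iterations than the 200 used here.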


Theorem 3.1 was proven by Moran himself in [65], by using the Perron—Frobenius 
theorem. The rest of Moran’s work [65, 66] deals with Kimura and Ohta’s ladder 
model [71], i.e., E is taken to be the set of the integers Z, and the only mutations that
are allowed to happen are those between integers at distance 1. Kingman [57] goes 
further and gives a sufficient condition on the fitness function A for the above result 
to hold even when the set E is infinite. 


3.2 The Galton—Watson Model 


The most basic model for the evolution of a population is the Galton—Watson model, 
historically introduced to understand why some natural populations become extinct. 
In order to model the simultaneous evolution of all the genotypes in a population, 
we consider a multitype Galton—Watson process. In the multitype Galton—Watson 
process, at each generation, each individual reproduces independently of all the other 
individuals in the generation, as well as of the past of the process. An individual 
of type $u$ produces a random number of offspring, distributed according to a law
$\mu^u$, which depends on $u$. The offspring then mutate independently according to the
mutation matrix $M$. The ensemble of the offspring after mutation forms the next
generation. We proceed next with the formal definition of the process. Our multitype
Galton–Watson model is a Markov chain with values in $\mathbb{N}^E$,

$$X_n = (X_n(u), u \in E)\,, \qquad n \geq 0\,,$$


where $X_n(u)$ represents the number of individuals of type $u$ in the generation $n$.
Suppose that, for each $u \in E$, we are given a probability measure $\mu^u$ on the
non-negative integers satisfying the following assumptions:

• $\mu^u(0) + \mu^u(1) < 1$,

• $\sum_{k \geq 0} k \, \mu^u(k) = A(u)$,

• $\sum_{k \geq 0} k^2 \mu^u(k) < \infty$.

The probability distribution $\mu^u$ is called the reproduction law of the type $u$, because
whenever an individual of type $u$ reproduces in the Galton–Watson process, the
number of offspring that are born has distribution $\mu^u$. Moreover, each offspring of
an individual of type $u$ mutates independently according to the mutation matrix $M$.
For a vector $N \in \mathbb{N}^E$, we define $|N|_1$ as the sum of its components, i.e.,

$$\forall N \in \mathbb{N}^E \qquad |N|_1 = \sum_{u \in E} N(u)\,.$$

The distribution $p^u$ of the offspring of an individual of type $u$ is given by

$$\forall N \in \mathbb{N}^E \qquad p^u(N) = \mu^u(|N|_1) \, (|N|_1)! \prod_{v \in E} \frac{M(u, v)^{N(v)}}{N(v)!}\,.$$

Conditionally on $X_n = N_n \in \mathbb{N}^E$, the vector $X_{n+1}$ is the sum of $|N_n|_1$ independent
random vectors, $N_n(u)$ of the random vectors having distribution $p^u$, for each $u$ in
$E$. A classical reference for the theory of Galton–Watson processes is the book of
Athreya and Ney [6].

Now, the question is: how is this multitype Galton—Watson process related to the 
quasispecies equation? The key to answer this question is the mean matrix of the 
process, whose entries are equal to 


$$E(X_1(v) \mid X_0 = e(u))\,, \qquad u, v \in E\,,$$


where e(u) is the vector with the coordinate corresponding to u equal to 1, and the 
other coordinates equal to 0. The above quantity corresponds to the average number 
of type v children that an individual of type u has. 


Lemma 3.2 The mean matrix of the process is equal to the mean reproduction matrix 
W, i.e, 


$$\forall u, v \in E \qquad E(X_1(v) \mid X_0 = e(u)) = W(u, v) = A(u) M(u, v)\,.$$


Proof Intuitively, the average number of offspring of type $v$ of a parent of type $u$ is
the average number of children in total, $A(u)$, times the average proportion of them
that mutate to the type $v$. Indeed, let $u, v \in E$. We decompose the expectation in the
mean matrix according to the number of children of the individual of type $u$:

$$E(X_1(v) \mid X_0 = e(u)) = \sum_{k \geq 0} \mu^u(k) \, E\big(X_1(v) \mid X_0 = e(u), \, |X_1|_1 = k\big)\,.$$

Given that $X_0 = e(u)$ and that $|X_1|_1 = k$, the random variable $X_1(v)$ follows a binomial
law of parameters $k$ and $M(u, v)$, the expectation of which is $k M(u, v)$, thus,

$$E(X_1(v) \mid X_0 = e(u)) = \sum_{k \geq 0} \mu^u(k) \, k \, M(u, v) = A(u) M(u, v)\,,$$

as announced. □
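
Lemma 3.2 can be checked by simulation. The sketch below samples the offspring vector of a single individual and averages it over many draws. It assumes, purely for illustration, that the reproduction law $\mu^u$ is Poisson with mean $A(u)$ (which satisfies the three assumptions above); the values of `A` and `M` and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical three-type example (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

def offspring(u, rng):
    """Offspring vector of one individual of type u.

    Assumption: the reproduction law mu^u is Poisson with mean A(u).
    """
    k = rng.poisson(A[u])              # total number of children
    return rng.multinomial(k, M[u])    # children mutate independently via M(u, .)

def empirical_mean_matrix(samples, rng):
    """Monte Carlo estimate of E(X_1(v) | X_0 = e(u)) for all u, v."""
    n = len(A)
    W_hat = np.zeros((n, n))
    for u in range(n):
        total = np.zeros(n)
        for _ in range(samples):
            total += offspring(u, rng)
        W_hat[u] = total / samples
    return W_hat

W = A[:, None] * M                     # W(u, v) = A(u) M(u, v)
W_hat = empirical_mean_matrix(20000, rng)
print(np.abs(W_hat - W).max())         # small, up to Monte Carlo error
```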
The null vector is an absorbing state for the Galton–Watson process. We say that
the Galton–Watson process dies out if it is absorbed into the null vector, and that it
survives indefinitely if

$$\forall n \geq 1 \quad \exists u \in E \qquad X_n(u) \geq 1\,.$$


The above event is called the survival event. With our assumption on the mutation 
matrix, namely that any type u can create any other type v through a mutation event, 
it turns out that one of the following two cases occurs: 

• either the probability of survival is null for any choice of $X_0$,

• or the probability of survival is positive for any choice of $X_0$, as long as $X_0 \neq 0$.

The following lemma ([6], theorem V.3.2) gives the necessary and sufficient conditions
for the survival of the Galton–Watson process. We recall that $\lambda$ is the Perron–Frobenius
eigenvalue of the matrix $W$ (see section 2.3).


Lemma 3.3 If $\lambda \leq 1$, the Galton–Watson process $(X_n)_{n \geq 0}$ dies out with probability
one. If $\lambda > 1$, the Galton–Watson process has a positive probability of survival.


Finally, as a direct application of theorem V.6.1 in [6], we show that, when the 
process survives, the asymptotic concentrations of the different types converge to 
the unique solution c* of the quasispecies equation (2.1). 


Theorem 3.4 Assume $\lambda > 1$. Conditionally on the survival event, we have

$$\lim_{n \to \infty} \frac{X_n}{|X_n|_1} = c^*\,.$$


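Proof We know from theorems V.6.1 and V.6.2 in [6] that, almost surely,

$$\lim_{n \to \infty} \frac{X_n}{\lambda^n} = Z c^*\,,$$

where $Z$ is a non-negative real random variable satisfying

$$P(Z = 0 \mid X_0 = e(u)) = P(\exists n \geq 0 \text{ such that } X_n = 0 \mid X_0 = e(u))\,.$$

Moreover the vector $c^*$ is already normalized so that $|c^*|_1 = 1$. Thus, conditionally
on the survival event,

$$\lim_{n \to \infty} \frac{X_n}{|X_n|_1} = \lim_{n \to \infty} \frac{\lambda^n}{|X_n|_1} \, \frac{X_n}{\lambda^n} = \lim_{n \to \infty} \frac{\lambda^n Z}{|X_n|_1} \, c^* = c^*\,,$$

because this is a sequence of vectors whose $| \cdot |_1$ norm is equal to one. □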
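
Theorem 3.4 can be observed on a simulated trajectory. The sketch below again assumes Poisson reproduction laws with mean $A(u)$, so that the total number of children of the $X_n(u)$ parents of type $u$ is Poisson with mean $A(u) X_n(u)$. The parameters, the initial population and the number of generations are hypothetical choices, and the run is only meaningful on the survival event.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical three-type example; here the Perron-Frobenius eigenvalue exceeds 1.
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
W = A[:, None] * M

def next_generation(X, rng):
    """Sample X_{n+1} given X_n = X, assuming Poisson(A(u)) reproduction laws."""
    Y = np.zeros_like(X)
    for u in range(len(X)):
        children = rng.poisson(A[u] * X[u])      # all children of type-u parents
        Y += rng.multinomial(children, M[u])     # independent mutations
    return Y

X = np.array([0, 0, 50], dtype=np.int64)         # 50 individuals of the last type
for _ in range(18):
    X = next_generation(X, rng)

proportions = X / X.sum()                        # X_n / |X_n|_1

# Normalized left Perron-Frobenius eigenvector of W, for comparison.
eigvals, left = np.linalg.eig(W.T)
k = np.argmax(eigvals.real)
c_star = np.abs(left[:, k].real)
c_star /= c_star.sum()
print(proportions, c_star)
```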


The link between the quasispecies equation and the multitype Galton—Watson process 
is already present in the article by Demetrius, Schuster and Sigmund [25], in a 
similar form to the one presented above. Further models of viral evolution based on
branching processes are developed and analyzed in the works of Antoneli, Bosco,
Castro and Janini [4, 5]. These models are more refined: they include a phenotypic
model and they exhibit several regimes, yielding new insights on the dynamics of
evolving virus populations. Recently a computational platform, the ENVELOPE
program, has been designed to simulate this kind of model [36].


3.3 The Wright–Fisher Model


The Wright—Fisher model is perhaps the most fundamental and well-known model of 
mathematical genetics. Although its definition is elementary, it provides a convenient 
framework to set basic questions in evolutionary theory and it leads to complicated 
mathematical problems. The Wright—Fisher model is a discrete time model, with 
successive non-overlapping generations having a constant size, denoted by $m$. The
$n$-th generation is denoted by $X_n$. The $(n+1)$-th generation $X_{n+1}$ depends on $X_n$ only and
it is built as follows. We sample independently $m$ individuals from $X_n$, the probability
for an individual to be chosen being proportional to its fitness. These $m$ individuals
undergo mutations and give rise to the generation $n+1$, which is the population $X_{n+1}$
(the transition mechanism of the Wright–Fisher process is schematically represented
in figure 3.1).

Fig. 3.1 Transition mechanism of the Wright–Fisher process: generation $n$ is transformed into generation $n+1$ by selection followed by mutation.

Mathematically, the Wright–Fisher model is a discrete time Markov
chain $(X_n)_{n \in \mathbb{N}}$ with state space $E^m$. Formally, the transition mechanism from $X_n$ to
$X_{n+1}$ is given by the formula

$$P(X_{n+1} = y \mid X_n = x) = \prod_{1 \leq i \leq m} \bigg( \sum_{1 \leq j \leq m} \frac{A(x(j))}{\sum_{1 \leq h \leq m} A(x(h))} \, M(x(j), y(i)) \bigg)\,.$$


Let us explain this formula. The product over $i$ corresponds to $m$ independent samples.
For each sample, we first select randomly an individual in the population $x$ according
to the fitness function $A$; the probability of selecting the $j$-th individual $x(j)$ in the
population $x$ is

$$\frac{A(x(j))}{\sum_{1 \leq h \leq m} A(x(h))}\,.$$

Once an individual is selected, say for instance the $j$-th individual, it undergoes its
reproduction cycle and several mutations might appear; the probability that it gives
birth to the individual $y(i)$ is $M(x(j), y(i))$.
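
The two-stage mechanism just described (selection, then mutation) can be sketched as a sampler for one transition of the chain on $E^m$. Types are encoded as integers `0, 1, 2`; the parameters and the function name `wright_fisher_step` are hypothetical and only serve the illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical three-type example (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

def wright_fisher_step(x, rng):
    """Sample X_{n+1} given X_n = x, where x is an array of m types."""
    fitness = A[x]                                   # A(x(1)), ..., A(x(m))
    # Selection: m independent draws, individual j chosen with
    # probability A(x(j)) / sum_h A(x(h)).
    parents = rng.choice(x, size=len(x), p=fitness / fitness.sum())
    # Mutation: a selected parent of type v gives birth to a child
    # of type w with probability M(v, w).
    return np.array([rng.choice(len(A), p=M[v]) for v in parents])

x = np.array([2] * 30)                               # m = 30, all of the last type
y = wright_fisher_step(x, rng)
print(np.bincount(y, minlength=len(A)) / len(y))     # concentrations after one step
```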

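The concentration process $(C_n)_{n \in \mathbb{N}}$ associated to the Wright–Fisher process is
defined by

$$\forall n \geq 0 \quad \forall u \in E \qquad C_n(u) = \frac{1}{m} \big| \{ 1 \leq i \leq m : X_n(i) = u \} \big|\,.$$

The quantity $C_n(u)$ is the fraction of individuals of type $u$ in the $n$-th generation. The
process $(C_n)_{n \in \mathbb{N}}$ takes its values in the simplex $S$ of dimension $|E|$, defined by

$$S = \Big\{ c \in [0, 1]^E : \sum_{u \in E} c(u) = 1 \Big\}\,,$$

more precisely in the subset $S_m$ of $S$ defined by

$$S_m = S \cap \frac{1}{m} \mathbb{N}^E\,.$$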


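The transition mechanism from $C_n$ to $C_{n+1}$ in the Wright–Fisher model depends on
$X_n$ only through $C_n$. To see this, let us define the mapping $f : E^m \to S_m$ by setting

$$\forall x \in E^m \quad \forall u \in E \qquad f(x)(u) = \frac{1}{m} \big| \{ 1 \leq i \leq m : x(i) = u \} \big|\,.$$

The collection of sets $f^{-1}(\{c\})$, $c \in S_m$, forms a partition of $E^m$. Let $c \in S_m$,
$x \in f^{-1}(\{c\})$, and let $g$ be any function on $E$. On one hand, we have

$$\sum_{1 \leq h \leq m} g(x(h)) = m \sum_{u \in E} c(u) g(u)\,.$$

On the other hand,

$$\prod_{1 \leq i \leq m} g(x(i)) = \prod_{v \in E} g(v)^{m c(v)}\,.$$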
Therefore, for any $c, c' \in S_m$, $x \in f^{-1}(\{c\})$ and $y \in f^{-1}(\{c'\})$, we have

$$P(X_{n+1} = y \mid X_n = x) = \prod_{v \in E} \bigg( \frac{\sum_{u \in E} c(u) A(u) M(u, v)}{\sum_{u \in E} c(u) A(u)} \bigg)^{m c'(v)}\,.$$

The crucial point is that this last expression depends on $x$ and $y$ only through
$c$ and $c'$. We can therefore apply the lumping theorem (theorem A.3 in the
appendix) to conclude that the process $(C_n)_{n \in \mathbb{N}}$ is still a Markov chain. Noting that
the cardinality of the set $f^{-1}(\{c'\})$ is given by the multinomial coefficient

$$\frac{m!}{\prod_{v \in E} (m c'(v))!}\,,$$

the transition mechanism of the process $(C_n)_{n \in \mathbb{N}}$ can be described as follows:
conditionally on $C_n = c \in S_m$, the law of $C_{n+1}$ is the normalized multinomial law
$\frac{1}{m} \mathrm{Mult}(m, \Phi(c))$, where the multidimensional parameter $\Phi(c)$ is given by

$$\forall u \in E \qquad \Phi(c)(u) = \frac{\sum_{v \in E} c(v) A(v) M(v, u)}{\sum_{v \in E} c(v) A(v)}\,. \tag{3.4}$$

This map $\Phi$ is precisely the map arising in the Moran–Kingman model, see
formula (3.2). The hypothesis 2.2 on the mutation kernel $M$ and the fitness function
$A$ guarantees that the transition mechanism of $(C_n)_{n \in \mathbb{N}}$ is irreducible. In addition,
its state space $S_m$ is finite. From a classical result of Markov chain theory (see for
instance [68]), the process $(C_n)_{n \in \mathbb{N}}$ admits a unique invariant probability measure,
which we denote by $\nu_m$. As before, we denote by $c^*$ the unique solution to the
quasispecies equation (2.1) which belongs to the simplex $S$. The next theorem reveals
the link between the quasispecies equation and the Wright–Fisher model.


Theorem 3.5 As the population size $m$ goes to $\infty$, the invariant probability measure
$\nu_m$ converges weakly towards the Dirac mass on $c^*$, i.e., for any continuous function
$f : S \to \mathbb{R}$,

$$\lim_{m \to \infty} \int_S f(c) \, d\nu_m(c) = f(c^*)\,.$$


Proof By definition of the invariant measure, the probability $\nu_m$ satisfies the following
equations:

$$\forall d \in S_m \qquad \nu_m(d) = \sum_{c \in S_m} \nu_m(c) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big)\,.$$


The space $S$ is a compact and separable space, therefore the space of the probability
measures over $S$ is sequentially compact for the weak convergence of measures (see
for instance [11], chapter 1, section 6). Let us examine what are the possible values
for the limit of a subsequence of $(\nu_m)_{m \geq 1}$. Let $f$ be a continuous real-valued function
defined on $S$. Our next goal is to prove that

$$\lim_{m \to \infty} \bigg| \int_S f(d) \, d\nu_m(d) - \int_S f(\Phi(c)) \, d\nu_m(c) \bigg| = 0\,.$$


In order to do so, we rely on the previous identity and the fact that the normalized
multinomial law (which can be seen as the mean of $m$ i.i.d. random vectors) will
concentrate around its mean as $m$ goes to $\infty$, thanks to the law of large numbers.
Now this mean is precisely given by the mapping $\Phi$, and this will yield the desired
result. Let us fix $\varepsilon > 0$. The simplex $S$ being compact, the function $f$ is uniformly
continuous on $S$. Thus there exists $\delta > 0$ such that

$$\forall c, d \in S \qquad |c - d| < \delta \implies |f(c) - f(d)| < \varepsilon\,.$$


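We write

$$\begin{aligned}
\int_S f(d) \, d\nu_m(d) - \int_S f(\Phi(c)) \, d\nu_m(c)
&= \sum_{d \in S_m} f(d) \nu_m(d) - \sum_{c \in S_m} f(\Phi(c)) \nu_m(c) \\
&= \sum_{d \in S_m} f(d) \sum_{c \in S_m} \nu_m(c) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big) - \sum_{c \in S_m} f(\Phi(c)) \nu_m(c) \\
&= \sum_{c \in S_m} \nu_m(c) \sum_{d \in S_m} \big( f(d) - f(\Phi(c)) \big) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big)\,.
\end{aligned}$$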


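We decompose the sum over $d$ as

$$\sum_{d \in S_m} \cdots = \sum_{d \in S_m, \, |d - \Phi(c)| \geq \delta} \cdots + \sum_{d \in S_m, \, |d - \Phi(c)| < \delta} \cdots\,.$$

We bound each term separately to get

$$\bigg| \sum_{d \in S_m} \big( f(d) - f(\Phi(c)) \big) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big) \bigg| \leq 2 \|f\|_\infty \, \mathrm{Prob}\big( |\mathrm{Mult}(m, \Phi(c)) - m \Phi(c)| \geq m \delta \big) + \varepsilon\,.$$

Next, we have

$$|\mathrm{Mult}(m, \Phi(c)) - m \Phi(c)| \leq \sum_{u \in E} |\mathrm{Mult}(m, \Phi(c))(u) - m \Phi(c)(u)|\,.$$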


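In fact, the $u$-component of the random vector $\mathrm{Mult}(m, \Phi(c))$ is distributed according
to a binomial law of parameters $m$, $\Phi(c)(u)$; however, these components are not
independent. Denoting by $\mathrm{Bin}(m, p)$ a binomial random variable with parameters
$m$, $p$, we have therefore

$$\begin{aligned}
\mathrm{Prob}\big( |\mathrm{Mult}(m, \Phi(c)) - m \Phi(c)| \geq m \delta \big)
&\leq \mathrm{Prob}\Big( \sum_{u \in E} |\mathrm{Bin}(m, \Phi(c)(u)) - m \Phi(c)(u)| \geq m \delta \Big) \\
&\leq \sum_{u \in E} \mathrm{Prob}\Big( |\mathrm{Bin}(m, \Phi(c)(u)) - m \Phi(c)(u)| \geq \frac{m \delta}{|E|} \Big)\,.
\end{aligned}$$

We use Hoeffding’s inequality A.4 (see the appendix) to control the probability
appearing in the last sum and we get

$$\mathrm{Prob}\big( |\mathrm{Mult}(m, \Phi(c)) - m \Phi(c)| \geq m \delta \big) \leq 2 |E| \exp\Big( - \frac{2 m \delta^2}{|E|^2} \Big)\,.$$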


Plugging these inequalities in the sum over $c$, we obtain

$$\bigg| \int_S f(d) \, d\nu_m(d) - \int_S f(\Phi(c)) \, d\nu_m(c) \bigg| \leq 4 \|f\|_\infty |E| \exp\Big( - \frac{2 m \delta^2}{|E|^2} \Big) + \varepsilon\,.$$

Passing to the limit in the above inequality and sending $\varepsilon$ to 0, we obtain that

$$\lim_{m \to \infty} \bigg| \int_S f(d) \, d\nu_m(d) - \int_S f(\Phi(c)) \, d\nu_m(c) \bigg| = 0\,.$$


Suppose now that $\nu^*$ is a probability measure on $S$ which is the weak limit of a
subsequence of $(\nu_m)_{m \geq 1}$. From the previous result, applied to this subsequence, we
see that, for any continuous function $f$ on $S$, the measure $\nu^*$ satisfies

$$\int_S f(d) \, d\nu^*(d) = \int_S f(\Phi(c)) \, d\nu^*(c)\,.$$

Thus the probability measure $\nu^*$ is invariant under the map $\Phi$. Iterating the previous
identity, we obtain

$$\forall n \geq 1 \qquad \int_S f(d) \, d\nu^*(d) = \int_S f(\Phi^n(c)) \, d\nu^*(c)\,.$$

By theorem 3.1, for any $c \in S$, the sequence $(\Phi^n(c))_{n \in \mathbb{N}}$ converges towards $c^*$.
Applying the dominated convergence theorem to the right integral, we conclude that

$$\int_S f(d) \, d\nu^*(d) = f(c^*)\,.$$

This is true for any continuous function $f$ on $S$, therefore $\nu^*$ is the Dirac mass on
$c^*$. Thus any converging subsequence of $(\nu_m)_{m \geq 1}$ converges to the Dirac mass on $c^*$.
By compactness, we conclude that the whole sequence $(\nu_m)_{m \geq 1}$ converges weakly
towards the Dirac mass on $c^*$. □


Directly expressing the result of theorem 3.5 with the process $(C_n)_{n \in \mathbb{N}}$, we get

$$\forall \varepsilon > 0 \quad \forall u \in E \qquad \lim_{m \to \infty} \lim_{n \to \infty} P\big( |C_n(u) - c^*(u)| > \varepsilon \big) = 0\,.$$

This means that, as the population size $m$ goes to $\infty$, the genotypic composition of
the population in the Wright–Fisher process at equilibrium converges in probability
to the solution $c^*$ of the quasispecies equation.
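
The concentration of the Wright–Fisher process around $c^*$ for large $m$ can be observed numerically by simulating the lumped chain directly: given $C_n = c$, the next state is $\frac{1}{m}\mathrm{Mult}(m, \Phi(c))$. The sketch below uses hypothetical parameters, a population size $m = 5000$ and a burn-in of 100 generations, all chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical three-type example (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
W = A[:, None] * M

def phi(c):
    """The map (3.2): Phi(c) = c W / |c W|_1."""
    cW = (c * A) @ M
    return cW / cW.sum()

def wright_fisher_concentrations(m, steps, rng):
    """Trajectory of (C_n): C_{n+1} is distributed as (1/m) Mult(m, Phi(C_n))."""
    c = np.zeros(len(A))
    c[-1] = 1.0                                   # start with a single type present
    path = np.empty((steps, len(A)))
    for n in range(steps):
        c = rng.multinomial(m, phi(c)) / m
        path[n] = c
    return path

path = wright_fisher_concentrations(m=5000, steps=400, rng=rng)
time_avg = path[100:].mean(axis=0)                # average after a burn-in period

# Normalized left Perron-Frobenius eigenvector of W, for comparison.
eigvals, left = np.linalg.eig(W.T)
k = np.argmax(eigvals.real)
c_star = np.abs(left[:, k].real)
c_star /= c_star.sum()
print(time_avg, c_star)
```

Decreasing `m` in this sketch makes the fluctuations of the trajectory around $c^*$ visibly larger, in line with the fact that the theorem is a statement about the limit $m \to \infty$.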


Chapter 4
Overlapping Generations


In this chapter, we present three further models of population genetics, namely the 
Eigen model, the continuous branching model and the Moran model. We show how 
to relate them with the quasispecies equation. A fundamental feature shared by these 
three models is that their successive generations are overlapping, in fact only one 
individual changes at a time. This may not be so obvious to see directly for the Eigen 
model, however the Eigen model can be obtained as the infinite population limit of 
the Moran model [23]. 


4.1 The Eigen Model 


In his seminal work [31], Manfred Eigen introduced a model for the time evolution 
of a population of macromolecules. The different types of macromolecules corre- 
spond to the elements of the set E. The evolution of each type of macromolecule is 
governed by a set of chemical reactions, which account respectively for the creation, 
degradation and replication of the macromolecule. The speed of replication of a 
macromolecule depends on its fitness, it is encoded in the function A. The repli- 
cation is error-prone and it gives rise to mutations, the mutation probabilities are 
encoded in the matrix $M$. We denote by $c_t(u)$ the concentration of macromolecules
of type $u$ at time $t$. Expressing the rate laws of the chemical reactions, Eigen obtained
the following system of differential equations:

$$\forall u \in E \qquad \frac{dc_t(u)}{dt} = \sum_{v \in E} c_t(v) A(v) M(v, u) - c_t(u) \varphi(t)\,. \tag{4.1}$$


The first term on the right-hand side accounts for the creation of macromolecules of
type $u$, while the second term corresponds to their degradation. Eigen had in mind
an experimental setup where the total concentration of the macromolecules is kept
constant (the chemostat), and this led him to choose for $\varphi(t)$ the mean fitness of the
population, i.e.,

$$\varphi(t) = \sum_{v \in E} c_t(v) A(v)\,.$$

Thus the classical Eigen model reads

$$\forall u \in E \qquad \frac{dc_t(u)}{dt} = \sum_{v \in E} c_t(v) A(v) M(v, u) - c_t(u) \sum_{v \in E} c_t(v) A(v)\,. \tag{4.2}$$


We will always suppose that the total concentration of the macromolecules at time 0
is equal to 1, i.e.,

$$\sum_{u \in E} c_0(u) = 1\,.$$

The solution $(c_t)_{t \geq 0}$ of the differential system (4.2) takes its values in the simplex $S$
of dimension $|E|$, defined by

$$S = \Big\{ c \in [0, 1]^E : \sum_{u \in E} c(u) = 1 \Big\}\,.$$


As before, we denote by c* the unique solution to the quasispecies equation (2.1) 
which belongs to the simplex S. 


Theorem 4.1 The differential system (4.2) admits a unique equilibrium point, which
is $c^*$. Moreover, for every $c \in S$, the system (4.2) with initial condition $c_0 = c$
converges to $c^*$, i.e.,

$$\lim_{t \to \infty} c_t = c^*\,.$$


Proof The equilibrium equation for the dynamical system (4.2) is the quasispecies
equation (2.1). As we have seen, it has a unique solution in the simplex $S$, given by
the vector $c^*$. The solution of the system (4.2) can be written

$$c_t = \frac{c_0 e^{Wt}}{|c_0 e^{Wt}|_1}\,,$$

where $e^{Wt}$ represents the matrix exponential, defined by

$$e^{Wt} = \sum_{n \geq 0} \frac{(Wt)^n}{n!}\,.$$


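The asymptotic behavior of the powers of the matrix $W$ is given by

$$\forall u, v \in E \qquad \lim_{n \to \infty} \frac{1}{\lambda^n} W^n(u, v) = d^*(u) c^*(v)\,,$$

where $\lambda$ is the Perron–Frobenius eigenvalue of the matrix $W$ and $d^*$ is the right
eigenvector associated to it, normalized so that the scalar product of $d^*$ and $c^*$
equals 1. For a proof of this result, see for instance theorem 1.2 in [82]. Let $\varepsilon > 0$
and choose $N$ large enough so that

$$\forall n \geq N \qquad \Big\| \frac{1}{\lambda^n} W^n - d^* (c^*)^T \Big\| < \varepsilon\,,$$

where $\| \cdot \|$ is any matrix norm and $(c^*)^T$ is the transpose of the vector $c^*$. We deduce
from here that

$$\begin{aligned}
\Big\| \frac{e^{Wt}}{e^{\lambda t}} - d^* (c^*)^T \Big\|
&\leq e^{-\lambda t} \sum_{n \geq 0} \Big\| \frac{W^n}{\lambda^n} - d^* (c^*)^T \Big\| \frac{(\lambda t)^n}{n!} \\
&\leq e^{-\lambda t} \sum_{n = 0}^{N} \Big\| \frac{W^n}{\lambda^n} - d^* (c^*)^T \Big\| \frac{(\lambda t)^n}{n!} + \varepsilon \, e^{-\lambda t} \sum_{n = N+1}^{\infty} \frac{(\lambda t)^n}{n!}\,.
\end{aligned}$$

The last term is smaller than $\varepsilon$. The other term goes to 0 as $t$ goes to $\infty$, hence there
exists a $t_0$ such that

$$\forall t \geq t_0 \qquad \Big\| \frac{e^{Wt}}{e^{\lambda t}} - d^* (c^*)^T \Big\| < 2 \varepsilon\,.$$

We conclude that, for any $c_0 \in S$,

$$\lim_{t \to \infty} c_t(u) = \lim_{t \to \infty} \frac{(c_0 e^{Wt})(u)}{|c_0 e^{Wt}|_1} = \frac{\sum_{v \in E} c_0(v) d^*(v) c^*(u)}{\sum_{v, w \in E} c_0(v) d^*(v) c^*(w)} = c^*(u)\,,$$

as desired. □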
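
The convergence stated in theorem 4.1 can be illustrated by integrating the system (4.2) numerically. The sketch below uses a plain explicit Euler scheme, which is our choice for simplicity and not a method taken from the text; the parameters, the step size `dt` and the time horizon are hypothetical.

```python
import numpy as np

# Hypothetical three-type example (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
W = A[:, None] * M

def eigen_rhs(c):
    """Right-hand side of (4.2)."""
    creation = (c * A) @ M          # sum_v c(v) A(v) M(v, u)
    mean_fitness = c @ A            # sum_v c(v) A(v)
    return creation - c * mean_fitness

c = np.array([0.0, 0.0, 1.0])       # initial condition in the simplex S
dt = 0.01                           # Euler step (assumed small enough)
for _ in range(3000):               # integrate up to time t = 30
    c = c + dt * eigen_rhs(c)

# Normalized left Perron-Frobenius eigenvector of W, for comparison.
eigvals, left = np.linalg.eig(W.T)
k = np.argmax(eigvals.real)
c_star = np.abs(left[:, k].real)
c_star /= c_star.sum()
print(c, c_star)
```

Since the right-hand side of (4.2) vanishes exactly at $c^*$, the point $c^*$ is also a fixed point of the Euler iteration, so the discretization does not shift the equilibrium.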


Eigen first presented his model in [31], but instead of trying to solve it as above,
he focused on giving approximate solutions in a particular case, the sharp peak
landscape; this will be the topic of part II. The result presented above appears in
several works; we believe that the first occurrences are [51] and [84].


4.2 The Continuous Branching Model 


The continuous branching model is the continuous time version of the multitype 
Galton—Watson process of section 3.2. In the continuous time process, individuals 
die at random times, independently of one another. The lifetime of an individual is 
given by an exponential random variable of parameter 1. When an individual of type
$u$ dies, it produces a random number of offspring, distributed according to the law
$\mu^u$ (see section 3.2). These offspring mutate, so the distribution of the offspring of
an individual of type $u$ is given by the law

$$\forall N \in \mathbb{N}^E \qquad p^u(N) = \mu^u(|N|_1) \, (|N|_1)! \prod_{v \in E} \frac{M(u, v)^{N(v)}}{N(v)!}\,.$$

Let us proceed with the more formal definition of the process. The continuous time
multitype branching process is a stationary Markov process with values in $\mathbb{N}^E$, which
we denote by

$$(X_t(u), u \in E)\,, \qquad t \geq 0\,.$$


Continuous time processes can conveniently be defined through their infinitesimal
generators. Let $e(u)$ be the vector with the coordinate corresponding to $u$ equal
to 1, and the other coordinates equal to 0. The infinitesimal generator of the process
$(X_t)_{t \geq 0}$ is given by

$$\forall u, v \in E \qquad W'(u, v) = \lim_{t \to 0} \frac{1}{t} \Big( E(X_t(v) \mid X_0 = e(u)) - 1_{\{u = v\}} \Big)\,.$$


Let us compute the above expectation for small $t$. Note that if $Y$ is an exponential
random variable with parameter 1,

$$P(Y > t) = e^{-t}\,,$$

so that, by Taylor’s theorem,

$$P(Y \leq t) = t + o(t) \quad \text{and} \quad P(Y > t) = 1 - t + o(t)\,.$$

Likewise, if $Y_1, \ldots, Y_N$ are $N$ independent exponential random variables with
parameter 1,

$$P(\exists \, i \neq j, \ Y_i \leq t, \ Y_j \leq t) = o(t)\,.$$

Thus, for small $t$, we can neglect the event that more than one individual has died
before $t$. If an individual of type $u$ dies, the average number of its offspring which
are of type $v$ is $A(u) M(u, v)$, hence

$$E(X_t(v) \mid X_0 = e(u)) = t A(u) M(u, v) + (1 - t) 1_{\{u = v\}} + o(t)\,.$$


We conclude that the infinitesimal generator $W'$ is given by $W' = W - I$, where $W$
is the matrix defined in (2.4) and $I$ is the identity matrix. The mean matrix $(M_t)_{t \geq 0}$
of the process is defined by

$$\forall t \geq 0 \quad \forall u, v \in E \qquad M_t(u, v) = E(X_t(v) \mid X_0 = e(u))\,.$$

It can be expressed in terms of the infinitesimal generator $W'$ as a matrix exponential,

$$M_t = e^{t W'}\,.$$


If $(\lambda_i, 1 \leq i \leq |E|)$ denote the eigenvalues of the matrix $W'$, the eigenvalues of the
matrix $M_t$ are then given by $(e^{t \lambda_i}, 1 \leq i \leq |E|)$, and the eigenvectors of the matrices
$M_t$ and $W'$ coincide. Moreover we know that $W' = W - I$, thus the eigenvalues of
$W'$ are equal to the eigenvalues of $W$ diminished by 1, and their eigenvectors are the
same. Let $c^*$ be the unique solution to the quasispecies equation (2.1) which belongs
to the simplex $S$. We thus have

$$c^* W' = c^* (W - I) = (\lambda - 1) c^*\,.$$

As for the Galton–Watson process, we say that the branching process $(X_t)_{t \geq 0}$ dies
out if it is eventually absorbed into the null vector, and that it survives indefinitely if

$$\forall t \geq 0 \quad \exists u \in E \qquad X_t(u) > 0\,.$$


The probability of survival is either null or positive, and this independently of the 
starting population, as long as it is not void. The following two results are the analogs 
of those in section 3.2, and can be deduced from theorem V.7.5.2 in [6]. 


Lemma 4.2 If $\lambda \leq 1$, the branching process $(X_t)_{t \geq 0}$ dies out with probability one.
If $\lambda > 1$, the branching process starting from a non-void population has a positive
probability of survival.


Theorem 4.3 Assume $\lambda > 1$. Conditionally on the survival event, we have

$$\lim_{t \to \infty} \frac{X_t}{|X_t|_1} = c^*\,.$$


As for the Galton—Watson process, this result is already present in [25]. 


4.3 The Moran Model 


Moran [64] modified the Wright—Fisher model in order to allow overlapping genera- 
tions. Contrary to the Wright—Fisher model, where the whole population is modified 
at each time step, in the Moran model, only one individual can be modified at a 
time. In the discrete time version, at each time step, an individual is selected in the 
population according to its fitness, it produces one offspring, which mutates and 
replaces one individual chosen at random in the population (the transition mecha- 
nism of the Moran process is schematically represented in figure 4.1). For reasons 
of mathematical elegance, we will consider here the continuous time version of the 
Moran model, and we proceed with its formal definition. The continuous time Moran 
model is the Markov process $(X_t)_{t \in \mathbb{R}^+}$ having the following infinitesimal generator:
for $\phi$ a function from $E^m$ to $\mathbb{R}$ and for any $x \in E^m$,

$$\lim_{t \to 0} \frac{1}{t} \Big( E\big( \phi(X_t) \mid X_0 = x \big) - \phi(x) \Big) = \sum_{1 \leq i, j \leq m} \sum_{u \in E} A(x(i)) M(x(i), u) \Big( \phi\big( x(j \leftarrow u) \big) - \phi(x) \Big)\,,$$

where $x(j \leftarrow u)$ denotes the vector $x$ in which the $j$-th coordinate is set to be equal
to $u$. We define the concentration process $(C_t)_{t \in \mathbb{R}^+}$ associated to the Moran process
$(X_t)_{t \in \mathbb{R}^+}$ by setting

$$\forall t \geq 0 \quad \forall u \in E \qquad C_t(u) = \frac{1}{m} \big| \{ 1 \leq i \leq m : X_t(i) = u \} \big|\,.$$

The quantity $C_t(u)$ is the fraction of individuals of type $u$ in the population at time $t$.
The process $(C_t)_{t \in \mathbb{R}^+}$ takes its values in the simplex $S$ of dimension $|E|$, defined by

$$S = \Big\{ c \in [0, 1]^E : \sum_{u \in E} c(u) = 1 \Big\}\,,$$

and more precisely in the subset $S_m$ of $S$ defined by

$$S_m = S \cap \frac{1}{m} \mathbb{N}^E\,.$$


Fig. 4.1 Transition mechanism of the Moran process: between time $n$ and time $n+1$, one individual is modified through selection, mutation and replacement.

As for the Wright–Fisher process, the sum of the rates appearing in the expression of
the infinitesimal generator depends on $x$ only through the concentration of each type
in the population $x$. We can therefore apply the lumping theorem A.3 to conclude
that the process $(C_t)_{t \in \mathbb{R}^+}$ is still a continuous time Markov chain, whose infinitesimal
generator $L_m$ is given by the following formula: for $f$ a function from $S_m$ to $\mathbb{R}$ and
for any $c \in S_m$,

$$L_m f(c) = \sum_{u, v, w \in E} c(u) A(u) M(u, v) c(w) \Big( f\Big( c + \frac{e(v) - e(w)}{m} \Big) - f(c) \Big)\,,$$

where $(e(u), u \in E)$ is the canonical basis of $\mathbb{R}^E$. The hypothesis on the mutation
kernel $M$ and the fitness function $A$ guarantees that the transition mechanism of
$(C_t)_{t \in \mathbb{R}^+}$ is irreducible. In addition, its state space is finite. Applying a classical
result of Markov chain theory (see for instance [68]), we conclude that the process
$(C_t)_{t \in \mathbb{R}^+}$ admits a unique invariant probability measure, which we denote by $\nu_m$. As
before, we denote by $c^*$ the unique solution to the quasispecies equation (2.1) which
belongs to the simplex $S$.


Theorem 4.4 As the population size $m$ goes to $\infty$, the invariant probability measure
$\nu_m$ converges weakly towards the Dirac mass on $c^*$, i.e., for any continuous function
$f : S \to \mathbb{R}$,

$$\lim_{m \to \infty} \int_S f(c) \, d\nu_m(c) = f(c^*)\,.$$


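Proof The invariant probability measure $\nu_m$ of the process $(C_t)_{t \in \mathbb{R}^+}$ satisfies

$$\forall f \in C^\infty(S) \qquad \sum_{c \in S_m} \nu_m(c) L_m f(c) = 0\,. \tag{4.3}$$

Here $C^\infty(S)$ denotes the set of the functions defined on the simplex $S$ with values
in $\mathbb{R}$ which are infinitely differentiable. As $m$ goes to $\infty$, the rescaled generator $m L_m$
converges towards the differential operator $L$ defined as follows: for any $f \in C^\infty(S)$,
for any $c \in S$,

$$\begin{aligned}
L f(c) &= \sum_{u, v, w \in E} c(u) A(u) M(u, v) c(w) \Big( \frac{\partial f}{\partial c(v)}(c) - \frac{\partial f}{\partial c(w)}(c) \Big) \\
&= \sum_{v \in E} \Big( \sum_{u \in E} c(u) A(u) M(u, v) - c(v) \sum_{u \in E} c(u) A(u) \Big) \frac{\partial f}{\partial c(v)}(c)\,.
\end{aligned} \tag{4.4}$$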


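Let us fix a function $f \in C^\infty(S)$. Since $S$ is compact, the previous convergence is in
fact uniform over $S$. Now, writing

$$\int_S L f \, d\nu_m = \int_S m L_m f \, d\nu_m + \int_S (L f - m L_m f) \, d\nu_m\,,$$

we obtain, thanks to (4.3),

$$\bigg| \int_S L f \, d\nu_m \bigg| \leq \sup_{c \in S} |L f(c) - m L_m f(c)|\,.$$

Since $m L_m f$ converges uniformly towards $L f$ over $S$, we conclude that

$$\lim_{m \to \infty} \int_S L f \, d\nu_m = 0\,. \tag{4.5}$$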
Ss 


Suppose now that $\nu^*$ is a probability measure on $S$ which is the weak limit of a
subsequence of $(\nu_m)_{m \geq 1}$. Taking the limit in the equality (4.5) along the subsequence,
we see that $\nu^*$ satisfies

$$\forall f \in C^\infty(S) \qquad \int_S L f \, d\nu^* = 0\,. \tag{4.6}$$

We denote by $(\Phi^t)_{t \in \mathbb{R}^+}$ the semigroup of transformations on $S$ associated to the
differential system of Eigen’s model. More precisely, for $c \in S$, the unique solution
of the system (4.2) with initial condition $c$ at time 0 is given by

$$\forall t \geq 0 \quad \forall u \in E \qquad c_t(u) = \Phi^t(c)(u)\,.$$
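Now, if we differentiate $f(\Phi^t(c))$ with respect to the time, we get, using (4.1),

$$\begin{aligned}
\frac{d}{dt} f(\Phi^t(c)) &= \sum_{u \in E} \frac{\partial f}{\partial c(u)}(\Phi^t(c)) \, \frac{d}{dt} \Phi^t(c)(u) \\
&= \sum_{u \in E} \frac{\partial f}{\partial c(u)}(\Phi^t(c)) \Big( \sum_{v \in E} c_t(v) A(v) M(v, u) - c_t(u) \sum_{v \in E} c_t(v) A(v) \Big)\,.
\end{aligned}$$

Looking at equation (4.4), we recognize the differential operator $L$, so that

$$\forall c \in S \quad \forall t \geq 0 \qquad \frac{d}{dt} f(\Phi^t(c)) = L f(\Phi^t(c))\,.$$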


We integrate this differential equation with respect to $\nu^*$. Interchanging the integral
and the derivative, and using (4.6) with $f \circ \Phi^t$, we conclude that

$$\forall t \geq 0 \qquad \int_S f \, d\nu^* = \int_S f \circ \Phi^t \, d\nu^*\,.$$

This is true for any $f \in C^\infty(S)$, thus $\nu^*$ is invariant under the semigroup $(\Phi^t)_{t \in \mathbb{R}^+}$.
By theorem 4.1, we have

$$\forall c \in S \qquad \lim_{t \to \infty} \Phi^t(c) = c^*\,.$$

Applying the dominated convergence theorem, we conclude that

$$\int_S f \, d\nu^* = f(c^*)\,.$$

This is true for any $f \in C^\infty(S)$, thus $\nu^*$ is the Dirac mass on $c^*$. Therefore any
converging subsequence of $(\nu_m)_{m \geq 1}$ converges to the Dirac mass on $c^*$. By
compactness, we conclude that the whole sequence $(\nu_m)_{m \geq 1}$ converges weakly towards
the Dirac mass on $c^*$. □


Directly expressing the result of theorem 4.4 with the process $(C_t)_{t \in \mathbb{R}^+}$, we get

$$\forall \varepsilon > 0 \quad \forall u \in E \qquad \lim_{m \to \infty} \lim_{t \to \infty} P\big( |C_t(u) - c^*(u)| > \varepsilon \big) = 0\,.$$

This means that, as the population size $m$ goes to $\infty$, the genotypic composition of
the population in the Moran process at equilibrium converges in probability to the
solution $c^*$ of the quasispecies equation.
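
The statement above can be explored with a simulation of the continuous time Moran process at the level of type counts. In the sketch below, each jump picks a parent proportionally to fitness, mutates its offspring according to $M$, and replaces a uniformly chosen individual. Since the total jump rate $m \sum_i A(x(i))$ depends on the state, each visited state is weighted by its mean holding time when averaging. The parameters, the population size and the number of jumps are hypothetical, and the function name is ours.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical three-type example (illustrative values only).
A = np.array([3.0, 1.0, 1.0])
M = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
W = A[:, None] * M
K = len(A)

def moran_time_average(m, jumps, burn_in, rng):
    """Time-weighted average of the concentrations along one trajectory."""
    counts = np.zeros(K, dtype=np.int64)
    counts[-1] = m                                # start with a single type present
    weighted = np.zeros(K)
    total_time = 0.0
    for step in range(jumps):
        fitness_mass = counts @ A                 # sum over individuals of A(x(i))
        if step >= burn_in:
            hold = 1.0 / (m * fitness_mass)       # mean holding time in this state
            weighted += hold * counts / m
            total_time += hold
        parent = rng.choice(K, p=counts * A / fitness_mass)   # selection
        child = rng.choice(K, p=M[parent])                    # mutation
        replaced = rng.choice(K, p=counts / m)                # uniform replacement
        counts[replaced] -= 1
        counts[child] += 1
    return weighted / total_time

estimate = moran_time_average(m=200, jumps=60000, burn_in=10000, rng=rng)

# Normalized left Perron-Frobenius eigenvector of W, for comparison.
eigvals, left = np.linalg.eig(W.T)
k = np.argmax(eigvals.real)
c_star = np.abs(left[:, k].real)
c_star /= c_star.sum()
print(estimate, c_star)
```

For a finite $m$ the estimate is only expected to be close to $c^*$, not equal to it; the agreement should improve as $m$ grows.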


Chapter 5
Probabilistic Representations


For the six models of evolving populations presented so far, we showed how the 
quasispecies equation pops up in a suitable asymptotic regime. Typically, the qua- 
sispecies equation describes the equilibrium of the models in the limit of infinite 
population size. However a quasispecies can be exactly realized with very simple 
probabilistic models involving one or a finite number of individuals. We present two 
such models in the next sections: first a stopped random walk, and second a stopped 
branching process. These constructions help to improve our understanding of the 
quasispecies structure. Moreover, they will provide a valuable tool to build exact 
formulas in the long chain regime that we explore later. 

We still work within the framework described in chapter 2, namely we consider a 
finite set E of types with a positive fitness function A and a mutation kernel M. Our 
goal is to obtain a probabilistic representation of the equilibrium equation (2.1) under 
the constraint (2.2). Since we are dealing with a finite genotype space, the solution 
of the quasispecies equation is the normalized left Perron—Frobenius eigenvector c* 
of the matrix W = (A(u)M(u, v), u,v € E), as we discussed already at the end of chap- 
ter 2. We shall in fact develop probabilistic representations of the Perron—Frobenius 
eigenvector c*. Although we work with the hypothesis 2.2, these representations are 
valid under the same hypothesis as the Perron—Frobenius theorem [82]. 


5.1 Stopped Random Walk 


Recall that the mutation matrix $M$ is a stochastic matrix. Thus we may consider
the Markov chain $(Z_n)_{n \geq 0}$ on $E$ which has $M$ for its transition matrix. We think of
$(Z_n)_{n \geq 0}$ as the random walk of a single mutant in the genotype space, and we call it
the mutant walk.


Notation. For $u \in E$, let $E_u$ be the expectation of the Markov chain $(Z_n)_{n \geq 0}$ started
from $u$, and let $\tau_u$ be the time of the first return of the chain to $u$, defined by

$$\tau_u = \inf \{ n \geq 1 : Z_n = u \}\,.$$

We give now an equation characterizing the Perron–Frobenius eigenvalue of the
mean reproduction matrix $W$, as well as two formulas for the associated
Perron–Frobenius eigenvector, which are built with the help of the Markov chain $(Z_n)_{n \geq 0}$.


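Proposition 5.1 Let $u \in E$. The Perron–Frobenius eigenvalue $\lambda$ of the matrix $W$
satisfies the equation

$$E_u \bigg[ \lambda^{-\tau_u} \prod_{k=0}^{\tau_u - 1} A(Z_k) \bigg] = 1\,. \tag{5.1}$$

The left Perron–Frobenius eigenvector $c^\circ$ of the matrix $W$ which satisfies $c^\circ(u) = 1$
is given by

$$\forall v \in E \qquad c^\circ(v) = E_u \bigg[ \sum_{n=0}^{\tau_u - 1} \bigg( 1_{\{Z_n = v\}} \, \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \bigg) \bigg]\,. \tag{5.2}$$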


Before proving this proposition, we present an interesting alternative formula. 


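Corollary 5.2 The normalized left Perron-Frobenius eigenvector $c^*$ of the matrix
$W$ is given by

$$ \forall u \in E \qquad c^*(u) = \frac{1}{E_u\Big[ \sum_{n=1}^{\tau_u} \Big( \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \Big]} . $$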


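Proof We normalize the vector $c^\circ$ defined in formula (5.2) and we obtain the following
formula for the vector $c^*$:

$$ \forall v \in E \qquad c^*(v) = \frac{E_u\Big[ \sum_{n=0}^{\tau_u-1} \Big( 1_{\{Z_n=v\}}\, \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \Big]}{E_u\Big[ \sum_{n=0}^{\tau_u-1} \Big( \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \Big]} = \frac{E_u\Big[ \sum_{n=0}^{\tau_u-1} \Big( 1_{\{Z_n=v\}}\, \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \Big]}{E_u\Big[ \sum_{n=1}^{\tau_u} \Big( \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \Big]} . \tag{5.3} $$

The two denominators coincide: the two sums differ by the quantity
$\lambda^{-\tau_u} \prod_{k=0}^{\tau_u-1} A(Z_k) - 1$, whose expectation vanishes by the identity (5.1).
Taking $v = u$ in formula (5.3), the numerator reduces to its term $n = 0$, which is equal
to 1, and we obtain the formula stated in the corollary. □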


Before jumping into the proof of proposition 5.1, a few remarks are due.
First, note that these formulas are valid for any matrix $W$ with positive
entries (even for any primitive matrix). Indeed, given a square positive matrix $W$
indexed by a finite set $E$, it suffices to take $A(u)$ to be the sum of the elements in the
row $u$, and $M(u,v) = W(u,v)/A(u)$. Second, note that the formula of proposition 5.1
is a generalization of the classical formula for the invariant probability measure of
a Markov chain. Indeed, in the particular case where $W$ is stochastic, $A$ is constant
equal to 1, $\lambda$ is also equal to 1, $c^*$ corresponds to the invariant probability measure
of the Markov chain, and the formula of the proposition becomes the well-known
formula

$$ \forall u \in E \qquad c^*(u) = \frac{1}{E_u(\tau_u)} . $$

Third, in the case where $\lambda > 1$, the factor $\lambda^{-n}$ is naturally interpreted as a killing
probability. More precisely, let us introduce a random clock $\tau_\lambda$, independent of the
mutant walk $(Z_n)_{n\ge 0}$, and distributed according to the geometric law of parameter
$1 - 1/\lambda$:

$$ \forall n \ge 1 \qquad P(\tau_\lambda = n) = \Big( 1 - \frac{1}{\lambda} \Big) \Big( \frac{1}{\lambda} \Big)^{n-1} . $$

The formula presented in corollary 5.2 can then be rewritten as

$$ \forall u \in E \qquad c^*(u) = \frac{1}{E_u\Big[ \sum_{n=0}^{(\tau_u \wedge \tau_\lambda) - 1} \prod_{k=0}^{n-1} A(Z_k) \Big]} . \tag{5.4} $$


We now prove proposition 5.1. 


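Proof We fix $u \in E$ and we denote $E_u$ and $\tau_u$ simply by $E$ and $\tau$. Let $c^\circ$ be the vector
defined in formula (5.2). Notice that $c^\circ(u) = 1$. Thus the vector $(c^\circ(v))_{v \in E}$ is non-null
and its components are non-negative. For $w \in E$, we compute

$$
\begin{aligned}
\sum_{v \in E} c^\circ(v) A(v) M(v,w)
&= \sum_{v \in E} \sum_{n \ge 0} E\Big( 1_{\{\tau > n\}}\, \lambda^{-n} \Big( \prod_{k=0}^{n-1} A(Z_k) \Big) 1_{\{Z_n = v\}}\, A(v) M(v,w) \Big) \\
&= \sum_{v \in E} \sum_{n \ge 0} E\Big( 1_{\{\tau > n\}}\, \lambda^{-n} \Big( \prod_{k=0}^{n} A(Z_k) \Big) 1_{\{Z_n = v\}}\, 1_{\{Z_{n+1} = w\}} \Big) \\
&= E\Big( \sum_{n \ge 0} 1_{\{\tau > n\}}\, \lambda^{-n} \Big( \prod_{k=0}^{n} A(Z_k) \Big) 1_{\{Z_{n+1} = w\}} \Big) \\
&= E\Big( \sum_{n=0}^{\tau-1} 1_{\{Z_{n+1} = w\}}\, \lambda^{-n} \Big( \prod_{k=0}^{n} A(Z_k) \Big) \Big) \\
&= \lambda\, E\Big( \sum_{n=1}^{\tau} 1_{\{Z_n = w\}}\, \lambda^{-n} \Big( \prod_{k=0}^{n-1} A(Z_k) \Big) \Big). 
\end{aligned}
\tag{5.5}
$$

Suppose that $w \ne u$. Then the term in the sum appearing in (5.5) vanishes for $n = 0$
and $n = \tau$, and we recover the identity

$$ \sum_{v \in E} c^\circ(v) A(v) M(v,w) = \lambda\, c^\circ(w). $$

For $w = u$, the only non-null term in the sum appearing in (5.5) corresponds to
$n = \tau$, and we obtain

$$ \sum_{v \in E} c^\circ(v) A(v) M(v,u) = \lambda\, E\Big[ \lambda^{-\tau} \prod_{k=0}^{\tau-1} A(Z_k) \Big]. $$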

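We develop the last expectation according to the value of $\tau$ as follows:

$$
\begin{aligned}
E\Big[ \lambda^{-\tau} \prod_{k=0}^{\tau-1} A(Z_k) \Big]
&= \sum_{n \ge 1} E\Big( 1_{\{\tau = n\}}\, \lambda^{-n} \prod_{k=0}^{n-1} A(Z_k) \Big) \\
&= \sum_{n \ge 1}\ \sum_{v_1,\dots,v_{n-1} \ne u} \lambda^{-n} A(u) A(v_1) \cdots A(v_{n-1})\, P(Z_1 = v_1, \dots, Z_{n-1} = v_{n-1}, Z_n = u) \\
&= \sum_{n \ge 1}\ \sum_{v_1,\dots,v_{n-1} \ne u} \lambda^{-n} A(u) M(u,v_1) \cdots A(v_{n-1}) M(v_{n-1},u).
\end{aligned}
\tag{5.6}
$$

Recalling that $c^\circ(u) = 1$, the only thing that remains to be proved is that this last sum is
equal to 1, i.e., that

$$ \frac{1}{\lambda} W(u,u) + \frac{1}{\lambda^2} \sum_{v_1 \ne u} W(u,v_1) W(v_1,u) + \cdots + \frac{1}{\lambda^n} \sum_{v_1,\dots,v_{n-1} \ne u} W(u,v_1) \cdots W(v_{n-1},u) + \cdots = 1. \tag{5.7} $$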

To prove this identity, we introduce a new Markov chain $(Z^*_n)_{n\ge 0}$ on $E$. Let $d^*$ be the
right Perron-Frobenius eigenvector of $W$, normalized so that the scalar product of
$c^*$ and $d^*$ is equal to 1, i.e., the vector $d^*$ satisfies $W d^* = \lambda d^*$ and

$$ \sum_{v \in E} c^*(v)\, d^*(v) = 1. $$

The transition matrix of the Markov chain $(Z^*_n)_{n\ge 0}$ is given by

$$ \forall n \ge 0 \qquad P(Z^*_{n+1} = w \mid Z^*_n = v) = \frac{W(v,w)\, d^*(w)}{\lambda\, d^*(v)} . $$

Let us denote by $E^*_u$ (respectively $P^*_u$) the expectation (respectively the law) of the
Markov chain $(Z^*_n)_{n\ge 0}$ started from $u$, and by $\tau^*_u$ the time of the first return of the
chain to $u$. The Markov chain $(Z^*_n)_{n\ge 0}$ is irreducible and its state space is finite,
hence it is recurrent and

$$ \sum_{n \ge 1} P^*_u(\tau^*_u = n) = 1. $$

Moreover, for any $n \ge 1$, the factors $d^*$ telescope along each path and

$$
\begin{aligned}
P^*_u(\tau^*_u = n) &= \sum_{v_1,\dots,v_{n-1} \ne u} P^*_u(Z^*_1 = v_1, \dots, Z^*_{n-1} = v_{n-1}, Z^*_n = u) \\
&= \frac{1}{\lambda^n} \sum_{v_1,\dots,v_{n-1} \ne u} W(u,v_1) \cdots W(v_{n-1},u).
\end{aligned}
$$

Combining together the two previous formulas, we obtain the desired identity (5.7).
Therefore the vector $(c^\circ(v), v \in E)$ is an eigenvector of $W$ associated to $\lambda$. Moreover
formulas (5.6) and (5.7) yield the identity (5.1). □
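
The identity (5.7) and the formula (5.2) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the $3 \times 3$ positive matrix `W` is arbitrary, the helper names `left_perron` and `first_return_sums` are hypothetical, the eigenpair is approximated by power iteration, and the infinite sums over paths avoiding $u$ are truncated after a fixed number of steps.

```python
def left_perron(W, iters=2000):
    """Power iteration for the left Perron-Frobenius pair of a positive matrix."""
    n = len(W)
    c = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        new = [sum(c[v] * W[v][w] for v in range(n)) for w in range(n)]
        lam = sum(new)               # c sums to 1, so this estimates lambda
        c = [x / lam for x in new]
    return lam, c


def first_return_sums(W, lam, u, steps=500):
    """Truncated path sums: left-hand side of (5.7) and the vector of (5.2)."""
    n = len(W)
    c0 = [0.0] * n
    c0[u] = 1.0                      # term n = 0 of formula (5.2)
    ret = 0.0                        # accumulates the left-hand side of (5.7)
    cur = [W[u][v] / lam for v in range(n)]   # weighted paths of length 1 from u
    for _ in range(steps):
        ret += cur[u]                # paths reaching u for the first time
        cur[u] = 0.0                 # the remaining paths must avoid u
        for v in range(n):
            c0[v] += cur[v]
        cur = [sum(cur[v] * W[v][w] for v in range(n)) / lam for w in range(n)]
    return ret, c0


# Arbitrary positive matrix, chosen only for illustration.
W = [[2.0, 1.0, 0.5],
     [0.3, 1.0, 0.7],
     [0.2, 0.4, 1.5]]
lam, c_star = left_perron(W)
ret, c0 = first_return_sums(W, lam, u=0)
c0_normalized = [x / sum(c0) for x in c0]
print(lam, ret)
print(c_star)
print(c0_normalized)
```

In this sketch `ret` plays the role of the left-hand side of (5.7), and `c0_normalized` is the vector of (5.2) after normalization, to be compared with the eigenvector returned by the power iteration.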


5.2 Stopped Branching Process 


The representation formula of proposition 5.1 is still a bit complicated. We shall start
with a slightly more complicated process, namely a branching process instead of a
random walk, in order to get a simpler representation formula. Our starting point is a
Galton-Watson model, similar to the one introduced in section 3.2. More precisely,
we consider a multitype Galton-Watson model

$$ X_n = (X_n(u),\ u \in E), \qquad n \ge 0, $$

with mean matrix given by the mean reproduction matrix $W$, i.e.,

$$ \forall u,v \in E \qquad E(X_1(v) \mid X_0 = e(u)) = W(u,v). $$

Recall that $X_n(u)$ represents the number of individuals of type $u$ present in the $n$-th
generation $X_n$ of the process. Let $u \in E$ be fixed. We shall stop the process $(X_n)_{n\ge 0}$ on
the type $u$ by killing the descendants of individuals of type $u$ in any generation $n \ge 2$.
The resulting process is denoted by $(X^u_n)_{n\ge 0}$, and we call it the stopped branching
process. Thus, in the stopped process $(X^u_n)_{n\ge 0}$, the individuals reproduce and mutate
as in the usual Galton-Watson process $(X_n)_{n\ge 0}$; however, from generation 1 onwards,
the individuals of type $u$ do not produce offspring.


Notation. We denote by $E_u$ the expectation for the process $(X^u_n)_{n\ge 0}$ starting with a
population consisting of one individual of type $u$.


Notice that the initial individual of type $u$ produces offspring as in the Galton-Watson
process $(X_n)_{n\ge 0}$, but individuals of type $u$ belonging to the subsequent generations
are prevented from having offspring. So, in the first generation $X^u_1$, individuals of
type $u$ belong to the offspring of the initial individual of type $u$. In the generation
$X^u_n$ for $n \ge 2$, individuals of type $u$ belong to the offspring of individuals in $X^u_{n-1}$ of
types different from $u$ which have undergone a mutation to become of type $u$.


We now give a formula for $c^*$ in terms of $(X^u_n)_{n\ge 0}$ and the Perron-Frobenius
eigenvalue $\lambda$ of the mean reproduction matrix $W$.

Proposition 5.3 The normalized left Perron-Frobenius eigenvector $c^*$ of the matrix
$W$ is given by

$$ \forall u \in E \qquad c^*(u) = \frac{1}{\sum_{n \ge 1} \sum_{v \in E} \lambda^{-n}\, E_u(X^u_n(v))} . $$
Notice that

$$ \sum_{v \in E} E_u(X^u_n(v)) = E_u\Big( \sum_{v \in E} X^u_n(v) \Big) $$

is simply the expected size of the population $X^u_n$. In the particular case where $A$ is
constant equal to 1, and we take for the reproduction law the Dirac mass at 1, the
matrix $W$ is equal to the mutation matrix $M$, the Perron-Frobenius eigenvalue $\lambda$ is
equal to 1 and the process $(X_n)_{n\ge 0}$ is the mutant walk used in the previous section.
The stopped process $(X^u_n)_{n\ge 0}$ is then the random walk started at $u$ and stopped at the
time $\tau_u$ of the first return to $u$. So, in this situation, the population $X^u_n$ has size 1 until
time $\tau_u$ and size 0 afterwards, therefore

$$ \sum_{n \ge 1} \sum_{v \in E} \lambda^{-n}\, E_u(X^u_n(v)) = E_u\Big[ \sum_{n \ge 1} \sum_{v \in E} X^u_n(v) \Big] = E_u\Big[ \sum_{n \ge 1} 1_{\{n \le \tau_u\}} \Big] = E_u(\tau_u) $$

and we recover again the classical formula for the invariant probability measure
of a Markov chain. Finally, in the case where $\lambda > 1$, the factor $\lambda^{-n}$ is naturally
interpreted as a killing probability, as we did in the case of the stopped random walk.
We introduce a random clock $\tau_\lambda$, independent of the branching process $(X^u_n)_{n\ge 0}$, and
distributed according to the geometric law of parameter $1 - 1/\lambda$:

$$ \forall n \ge 1 \qquad P(\tau_\lambda = n) = \Big( 1 - \frac{1}{\lambda} \Big) \Big( \frac{1}{\lambda} \Big)^{n-1} . $$

The formula presented in proposition 5.3 can then be rewritten as

$$ \forall u \in E \qquad c^*(u) = \frac{1}{E_u\Big[ \sum_{n=1}^{\tau_\lambda - 1} \sum_{v \in E} X^u_n(v) \Big]} . \tag{5.8} $$

The nicest situation is when the Perron-Frobenius eigenvalue $\lambda$ is equal to one. In
this case, the formula becomes

$$ \forall u \in E \qquad c^*(u) = \frac{1}{\sum_{n \ge 1} \sum_{v \in E} E_u(X^u_n(v))} . $$

The denominator is naturally interpreted as the expected number of descendants
from an individual of type $u$, for the process in which the descendants of type $u$ are
forbidden to reproduce. Let us remark also that, by multiplying the matrix $W$ by a
constant factor, we can adjust the value of the Perron-Frobenius eigenvalue without
altering the Perron-Frobenius eigenvector. More precisely, suppose that the mean
matrix of the Galton-Watson process is given by

$$ \forall u,v \in E \qquad E(X_1(v) \mid X_0 = e(u)) = a\, W(u,v), $$

where $a$ is a positive constant. If we take $a = 1/\lambda$, then we indeed obtain a critical
branching process whose Perron-Frobenius eigenvalue is 1. Of course there are
several ways to realize such a process. Here is a natural possibility. The reproduction
law of the type $u$ is taken to be the Poisson distribution with parameter $A(u)/\lambda$.
After reproduction, the offspring mutate independently according to the mutation
matrix $M$. The process $(X_n)_{n\ge 0}$ is quite similar to the one considered in section 3.2.
The crucial difference is that, in section 3.2, we were interested in the asymptotic
behavior of the process in the supercritical case, while here we rescale the mean
matrix $W$ by $\lambda$ to ensure that the process is exactly critical.
Let us come to the proof of proposition 5.3.


Proof We fix $u \in E$ and we write $E_u$ simply as $E$. We define

$$ \forall v \in E \qquad y(v) = \sum_{n \ge 1} \lambda^{-n}\, E(X^u_n(v)). $$

Let us first examine $y(u)$. By definition of the stopped process $(X^u_n)_{n\ge 0}$, we have

$$ y(u) = \sum_{n \ge 1}\ \sum_{v_1,\dots,v_{n-1} \ne u} \lambda^{-n}\, W(u,v_1) \cdots W(v_{n-1},u). $$

From formula (5.7), which was obtained in the course of the proof of proposition 5.1,
we see that the above sum is in fact equal to 1, thus $y(u) = 1$. Next let $v \in E$. We
have

$$ (yW)(v) = \sum_{w \in E} y(w) W(w,v) = W(u,v) + \sum_{w \in E \setminus \{u\}} y(w) W(w,v). $$

We compute

$$
\begin{aligned}
\sum_{w \in E \setminus \{u\}} y(w) W(w,v)
&= \sum_{w \in E \setminus \{u\}} \sum_{n \ge 1} \lambda^{-n}\, E(X^u_n(w))\, W(w,v) \\
&= \sum_{n \ge 1} \lambda^{-n}\, E\Big( \sum_{w \in E \setminus \{u\}} X^u_n(w) W(w,v) \Big).
\end{aligned}
$$

Now, for any $n \ge 1$, from the definition of the process $(X^u_n)_{n\ge 0}$, we have

$$ E(X^u_{n+1}(v) \mid X^u_n) = \sum_{w \in E \setminus \{u\}} X^u_n(w) W(w,v), $$

whence

$$
\begin{aligned}
\sum_{w \in E \setminus \{u\}} y(w) W(w,v)
&= \sum_{n \ge 1} \lambda^{-n}\, E\big( E(X^u_{n+1}(v) \mid X^u_n) \big) \\
&= \sum_{n \ge 1} \lambda^{-n}\, E(X^u_{n+1}(v)) \\
&= \lambda\, y(v) - E(X^u_1(v)).
\end{aligned}
$$

Remember that the initial individual of type $u$ reproduces as in the Galton-Watson
process $(X_n)_{n\ge 0}$, therefore $E(X^u_1(v)) = W(u,v)$. Putting together the previous
computations, we obtain

$$ \forall v \in E \qquad (yW)(v) = \lambda\, y(v). $$

Since in addition $y(u) = 1$, we conclude that all the components of $y$ are positive
and finite, therefore $y$ is a left Perron-Frobenius eigenvector of $W$. We normalize it
to obtain $c^*$, and we obtain

$$ c^*(u) = \frac{y(u)}{\sum_{v \in E} y(v)} = \frac{1}{\sum_{v \in E} \sum_{n \ge 1} \lambda^{-n}\, E(X^u_n(v))} . $$

This is the desired formula. □
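
The formula of proposition 5.3 lends itself to a direct numerical check, because the expected generation sizes of the stopped process obey the linear recursion used in the proof. The following sketch rests on assumptions that are not in the text: the matrix `W` is an arbitrary positive example, the function names are hypothetical, the eigenpair comes from power iteration, and the series over $n$ is truncated.

```python
def left_perron(W, iters=2000):
    """Power iteration for the left Perron-Frobenius pair of a positive matrix."""
    n = len(W)
    c = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        new = [sum(c[v] * W[v][w] for v in range(n)) for w in range(n)]
        lam = sum(new)               # c sums to 1, so this estimates lambda
        c = [x / lam for x in new]
    return lam, c


def stopped_branching_formula(W, lam, u, steps=500):
    """Evaluate 1 / sum_{n>=1} sum_v lam^{-n} E_u(X_n^u(v)), truncated at `steps`."""
    n = len(W)
    # weighted[v] = lam^{-n} * E_u(X_n^u(v)), starting with generation n = 1,
    # where E_u(X_1^u(v)) = W(u, v).
    weighted = [W[u][v] / lam for v in range(n)]
    total = 0.0
    for _ in range(steps):
        total += sum(weighted)
        # From generation 1 onwards, individuals of type u leave no offspring.
        weighted = [sum(weighted[w] * W[w][v] for w in range(n) if w != u) / lam
                    for v in range(n)]
    return 1.0 / total


# Arbitrary positive matrix, chosen only for illustration.
W = [[2.0, 1.0, 0.5],
     [0.3, 1.0, 0.7],
     [0.2, 0.4, 1.5]]
lam, c_star = left_perron(W)
formula = [stopped_branching_formula(W, lam, u) for u in range(len(W))]
print(c_star)
print(formula)
```

Working with the rescaled means `weighted` rather than with the raw means keeps the numbers bounded, since the raw expected sizes may grow geometrically with $n$.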


Part II 
The Sharp Peak Landscape 


Overview of Part II 


This part focuses on the original framework treated by Eigen [31] and developed 
further by Eigen and Schuster [33]: the set of genotypes is taken to be the set of 
sequences of fixed finite length over some finite alphabet; mutations are assumed to 
happen independently on each site of the sequence, with equal probability for all sites. 
In chapter 6, we introduce the sharp peak landscape—where a particular sequence 
(called the master sequence) has a higher fitness than the rest—and we define the 
long chain regime. We lump the quasispecies equation over the Hamming classes by 
adding up the concentrations of the sequences at a given Hamming distance from 
the master sequence. This leads to a considerable simplification. We derive a limit 
equation for the quasispecies equation in the long chain regime, which we study 
in detail in chapter 7. Depending on the fitness $\sigma$ of the master sequence and the
mutation rate $a$, this limit equation admits one or two solutions, and this leads to
the error threshold phenomenon. Above a critical mutation rate, the only solution
is null, while below the critical rate, there exists a non-null stable solution. This
solution describes the structure of the quasispecies. We give an explicit formula
for the quasispecies distribution, which we denote by $Q(\sigma,a)$. According to this
formula, the proportion of sequences at Hamming distance $k \ge 0$ from the master
sequence is

$$ Q(\sigma,a)(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h \ge 1} \frac{h^k}{\sigma^h} . $$

This formula possesses a rich combinatorial structure, which we start to explore in 
chapter 9. In particular, we rewrite the formula with the help of the Eulerian numbers 
and the Stirling numbers. Finally, in chapter 10, we consider the Moran—Kingman 
model and the Eigen model in the long chain regime, obtaining infinite versions 
of them. We show how these limit models are linked with the limit quasispecies 
equation. We implement the same scheme for the Perron—Frobenius eigenvector 
associated to the sharp peak landscape. 


Chapter 6
Long Chain Regime


Ideally, we would like to have explicit formulas for the mean fitness $\lambda$ and the
equilibrium concentrations $c^*$ in terms of $A$ and $M$. There is little hope of obtaining
such explicit formulas in the general case, without further assumptions. Therefore, 
we focus on a particular choice of the set of genotypes E and of the mutation matrix 
M. Both for practical and historical reasons, we make the same choice as Eigen did. 
The first part of the text was devoted to the case where the set of genotypes E is finite. 
In practice, the set of possible genotypes of a species is huge, at least it is much 
larger than the size of the population, which is itself very large. Our next goal is to 
attack the more relevant situation where the number of possible genotypes grows to 
infinity, in a judicious scale. This is the long chain regime, already considered by 
Eigen. 


6.1 Genotypes and Mutations 


We present here the simplified biological framework in which we define our toy 
model. In the previous chapter we have used the notation E, M, A for generic finite 
genotype spaces, mutation matrices and fitness functions. We now make a concrete 
choice of a genotype space, mutation matrix and fitness function; in order to avoid 
confusion, we use the notation E, M, A instead to refer to this concrete choice. 


Genotypes. Let $\mathcal{A}$ be a finite alphabet and let $\kappa = |\mathcal{A}|$ be its cardinality. Let $\ell \ge 1$
be an integer. We consider our genotype space to be $E = \mathcal{A}^\ell$, the space of sequences
of length $\ell$ over the alphabet $\mathcal{A}$. Elements of this space represent the chromosome
of a haploid individual, or equivalently its genotype. In our model, all the genes have
the same set of alleles and each letter of the alphabet $\mathcal{A}$ is a possible allele. Typical
examples are $\mathcal{A} = \{A, T, G, C\}$ to model standard DNA, or $\mathcal{A} = \{0, 1\}$ to deal with
binary sequences. Generic elements of $\mathcal{A}^\ell$ will be denoted by the letters $u, v, w$. The
space $\mathcal{A}^\ell$ is endowed with a natural distance, called the Hamming distance, which
counts the number of different letters between two sequences:

$$ \forall u,v \in \mathcal{A}^\ell \qquad d(u,v) = \big| \{\, 1 \le i \le \ell : u(i) \ne v(i) \,\} \big| . $$


Mutations. The mutation mechanism is the same for all the loci, and mutations
occur independently. We denote by $q \in\, ]0,1[$ the probability that a mutation occurs
at one particular locus; we will refer to the parameter $q$ as the mutation probability.
If a mutation occurs, then the letter is replaced randomly by another letter, chosen
uniformly over the $\kappa - 1$ remaining letters. Mutations are rare, and the most likely
outcome for a given letter is to stay unaltered. We should therefore think of $q$ as
being small, and we will always suppose that $q < 1 - 1/\kappa$. As usual, we encode this
mechanism in a mutation matrix

$$ M(u,v), \qquad u,v \in \mathcal{A}^\ell, $$

where $M(u,v)$ is the probability that the chromosome $u$ is transformed by mutation
into the chromosome $v$. The mutation probability $M(u,v)$ is thus given by

$$ \forall u,v \in \mathcal{A}^\ell \qquad M(u,v) = \Big( \frac{q}{\kappa - 1} \Big)^{d(u,v)} (1 - q)^{\ell - d(u,v)} . $$

With our choice of the mutation scheme, the matrix $M$ is symmetric, thanks to the
symmetry of the Hamming distance.
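
As a concrete illustration, the mutation matrix can be built explicitly when $\ell$ is small. The snippet below is a minimal sketch; the function names `hamming` and `mutation_matrix` and the parameter values ($\ell = 3$, $q = 0.1$, the DNA alphabet) are chosen for the example and are not part of the text.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which the sequences u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def mutation_matrix(alphabet, ell, q):
    """M(u, v) = (q/(kappa-1))^d(u,v) * (1-q)^(ell-d(u,v)) on alphabet^ell."""
    kappa = len(alphabet)
    seqs = list(product(alphabet, repeat=ell))
    M = []
    for u in seqs:
        row = []
        for v in seqs:
            d = hamming(u, v)
            row.append((q / (kappa - 1)) ** d * (1 - q) ** (ell - d))
        M.append(row)
    return seqs, M


seqs, M = mutation_matrix("ATGC", ell=3, q=0.1)
row_sums = [sum(row) for row in M]
symmetric = all(M[i][j] == M[j][i]
                for i in range(len(M)) for j in range(len(M)))
print(len(seqs), min(row_sums), max(row_sums), symmetric)
```

The row sums equal 1 up to rounding, which reflects the fact that $M$ is stochastic, and the symmetry test succeeds because each entry depends on $u$ and $v$ only through $d(u,v)$.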


6.2 Sharp Peak Fitness 


We have not specified the fitness function yet. Let us consider first the simplest
possible scenario, a constant fitness function: $A(u) = a > 0$ for all $u \in \mathcal{A}^\ell$. When
the fitness function is constant, there is no selection among different genotypes, and
we say that the landscape is selectively neutral. Under the constraint (2.2), since $A$
is constant,

$$ \lambda = \sum_{v \in E} c(v) A(v) = a. $$

The matrix $M$ is symmetric and also stochastic, that is, each row of the matrix adds
up to 1. It is thus doubly stochastic, that is, each column of the matrix adds up to
1 too. We conclude that, for a constant fitness function, the unique solution of the
system (2.1) satisfying the constraint (2.2) is given by

$$ c(u) = \frac{1}{|E|} = \frac{1}{\kappa^\ell}, \qquad u \in \mathcal{A}^\ell. $$

However, adaptive neutrality is seldom found in biological populations. We thus
embark on a quest for explicit formulas involving more complex fitness functions.
The simplest non-neutral fitness function which comes to mind is the sharp peak
function: there is a privileged genotype $w^* \in \mathcal{A}^\ell$, referred to as the master sequence
(also called the wild type in the literature), which has a higher fitness than the rest.




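Let $\sigma > 1$ and let the fitness function $A$ be given by

$$ \forall u \in \mathcal{A}^\ell \qquad A(u) = \begin{cases} \sigma & \text{if } u = w^*, \\ 1 & \text{if } u \ne w^*. \end{cases} $$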


This is the fitness function that Eigen studied in detail in his article [31]. 


6.3 Hamming Classes 


One of the main advantages of working with the sharp peak landscape is that we
can break the space of genotypes into Hamming classes. For $k \in \{0, \dots, \ell\}$, the
Hamming class $k$, denoted by $H_k$, is the subset of $\mathcal{A}^\ell$ containing all the genotypes
that are at Hamming distance $k$ from the master sequence. Let us define the function
$A_H : \{0, \dots, \ell\} \to \mathbb{R}^+$ by

$$ \forall k \in \{0, \dots, \ell\} \qquad A_H(k) = \begin{cases} \sigma & \text{if } k = 0, \\ 1 & \text{if } k > 0. \end{cases} $$

For each $k$, the value $A_H(k)$ is the fitness common to all the genotypes in the
Hamming class $k$. As the next lemma shows, the mutation probabilities can also be
lumped over the Hamming classes. Let $b, c \in \{0, \dots, \ell\}$ and let $X$, $Y$ be independent
random variables with binomial distributions

$$ X \sim \mathrm{Bin}\big(b,\ q/(\kappa - 1)\big), \qquad Y \sim \mathrm{Bin}(\ell - b,\ q), $$

and define

$$ M_H(b,c) = P(b - X + Y = c). $$


Lemma 6.1 Let $b, c \in \{0, \dots, \ell\}$. For any genotype $u$ in the Hamming class $b$, we
have

$$ \sum_{v \in H_c} M(u,v) = M_H(b,c). $$


Proof The quantity $\sum_{v \in H_c} M(u,v)$ is the probability of $u$ ending up in the class $c$ after
mutation. We say that a letter in a given genotype is correct or incorrect depending
on whether it coincides with its counterpart in the master sequence or not. Since $u$ is
in the Hamming class $b$, it has $b$ incorrect letters and $\ell - b$ correct ones. Each letter
mutates independently with probability $q$. Therefore, the number of incorrect letters
created from the correct ones has the law $\mathrm{Bin}(\ell - b, q)$. Likewise, the number of
correct letters created from the incorrect ones has the law $\mathrm{Bin}(b, q/(\kappa - 1))$, since
an incorrect letter changes to the correct one with probability $q/(\kappa - 1)$. Noting that
these binomial laws are independent of the placement of the correct and incorrect
letters (and therefore of each other), we get the desired result. □

Let $k \in \{0, \dots, \ell\}$. Adding up the equations of the system (2.1) for $u \in H_k$ and
rearranging the sums according to the Hamming classes, we get

$$ \sum_{u \in H_k} c(u) \sum_{0 \le h \le \ell}\ \sum_{v \in H_h} c(v) A(v) = \sum_{0 \le h \le \ell}\ \sum_{v \in H_h} c(v) A(v) \sum_{u \in H_k} M(v,u). $$

We define the new variables $y(h)$, $0 \le h \le \ell$, by

$$ \forall h \in \{0, \dots, \ell\} \qquad y(h) = \sum_{u \in H_h} c(u). $$

In view of the above remarks, these new variables satisfy the system

$$ y(k) \sum_{0 \le h \le \ell} y(h) A_H(h) = \sum_{0 \le h \le \ell} y(h) A_H(h) M_H(h,k), \qquad 0 \le k \le \ell. \tag{6.1} $$


The number of equations has been dramatically reduced from $\kappa^\ell$ to $\ell + 1$. Moreover
the new system (6.1) has the same form as (2.1), and therefore all the considerations
of section 2.3 still hold for this new system. Under the constraint (2.2), the mean
fitness might be rewritten as

$$ \sum_{0 \le h \le \ell} y(h) A_H(h) = (\sigma - 1)\, y(0) + 1. $$

The above system then becomes

$$ y(k) \big( (\sigma - 1)\, y(0) + 1 \big) = \sum_{0 \le h \le \ell} y(h) A_H(h) M_H(h,k), \qquad 0 \le k \le \ell. \tag{6.2} $$


Bratus, Novozhilov and Semenov [78, 80] have obtained beautiful exact formulas
for the solution to this equation. Their formulas are valid for any fixed $\ell$. They have
an equation satisfied by the mean fitness (that is, $(\sigma - 1)\, y(0) + 1$ in our case) and
a formula for the quasispecies distribution, which depends on the mean fitness (see
equations (5.4) and (5.5) in [78]).
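
Lemma 6.1, on which the lumping of this section rests, can be verified by brute force for a short chain. The sketch below is an illustration only: the ternary alphabet, the length $\ell = 4$, the value $q = 0.15$, the choice of the all-zero word as master sequence and every function name are assumptions made for the example.

```python
from itertools import product
from math import comb


def binom_pmf(n, p, k):
    """P(Bin(n, p) = k), with value 0 outside {0, ..., n}."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def lumped_mutation(ell, kappa, q, b, c):
    """M_H(b, c) = P(b - X + Y = c), X ~ Bin(b, q/(kappa-1)), Y ~ Bin(ell-b, q)."""
    return sum(binom_pmf(b, q / (kappa - 1), x) * binom_pmf(ell - b, q, c - b + x)
               for x in range(b + 1))


def mutation_probability(u, v, kappa, q):
    """M(u, v) for two sequences of the same length."""
    d = sum(1 for s, t in zip(u, v) if s != t)
    return (q / (kappa - 1)) ** d * (1 - q) ** (len(u) - d)


alphabet, ell, q = (0, 1, 2), 4, 0.15
kappa = len(alphabet)
master = (0,) * ell                      # master sequence chosen for the example
seqs = list(product(alphabet, repeat=ell))


def hamming_class(u):
    """Hamming distance between u and the master sequence."""
    return sum(1 for s, t in zip(u, master) if s != t)


max_gap = 0.0
for u in seqs:
    b = hamming_class(u)
    for c in range(ell + 1):
        direct = sum(mutation_probability(u, v, kappa, q)
                     for v in seqs if hamming_class(v) == c)
        max_gap = max(max_gap, abs(direct - lumped_mutation(ell, kappa, q, b, c)))
print(len(seqs), ell + 1, max_gap)
```

The first two printed numbers are the sizes of the original and of the lumped state spaces, that is, $\kappa^\ell$ and $\ell + 1$.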


6.4 Limit Equation 


Although the system of equations (6.2) is much simpler than the initial one, explicit
formulas for $y$ are still out of reach (in the formulas of [78], $y(0)$ is characterized
as the root of a complicated equation). In order to get very simple formulas, we
consider the asymptotic regime

$$ \ell \to +\infty, \qquad q \to 0, \qquad \ell q \to a \in\, ]0, +\infty[\, . $$

We will refer to this asymptotic regime as the long chain regime. This asymptotic
regime, already considered by Eigen, arises naturally when modelling a population
of individuals with a very long genome, in which the mean number of observed
mutations per individual in the reproduction cycle is $a$. Indeed, a variety of living
organisms have mutation rates that are of the order of the inverse genome length [39].
Mathematically, the long chain regime is particularly convenient because the
mutation matrix acquires a simpler form, thanks to the convergence of the binomial law
to the Poisson distribution. This fact is illustrated in the next lemma.


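Lemma 6.2 Let $b, c \ge 0$. The mutation probability $M_H(b,c)$ satisfies

$$ \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} M_H(b,c) = \begin{cases} e^{-a}\, \dfrac{a^{c-b}}{(c-b)!} & \text{if } c \ge b, \\[2mm] 0 & \text{if } c < b. \end{cases} $$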


Proof Recall that if $X \sim \mathrm{Bin}(b, q/(\kappa - 1))$ and $Y \sim \mathrm{Bin}(\ell - b, q)$ are independent
random variables, then

$$ M_H(b,c) = P(-X + Y = c - b). $$

Since $b$ is fixed, in the long chain regime, the law $\mathrm{Bin}(b, q/(\kappa - 1))$ converges to a
Dirac mass at 0, and the law $\mathrm{Bin}(\ell - b, q)$ converges to a Poisson law of parameter $a$.
The formula appearing in the lemma is precisely the probability of a Poisson random
variable of parameter $a$ being equal to $c - b$. □
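
The convergence stated in lemma 6.2 can be observed numerically by taking $q = a/\ell$ for increasing values of $\ell$. The following sketch is only indicative; the values $a = 1$, $\kappa = 4$, the range of $(b,c)$ and the helper names are assumptions of the example.

```python
from math import comb, exp, factorial


def binom_pmf(n, p, k):
    """P(Bin(n, p) = k), with value 0 outside {0, ..., n}."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def lumped_mutation(ell, kappa, q, b, c):
    """M_H(b, c) = P(b - X + Y = c), X ~ Bin(b, q/(kappa-1)), Y ~ Bin(ell-b, q)."""
    return sum(binom_pmf(b, q / (kappa - 1), x) * binom_pmf(ell - b, q, c - b + x)
               for x in range(b + 1))


def poisson_limit(a, b, c):
    """Right-hand side of lemma 6.2."""
    if c < b:
        return 0.0
    return exp(-a) * a ** (c - b) / factorial(c - b)


a, kappa = 1.0, 4
errors = {}
for ell in (10, 100, 1000, 10000):
    q = a / ell                          # so that ell * q = a exactly
    errors[ell] = max(abs(lumped_mutation(ell, kappa, q, b, c) - poisson_limit(a, b, c))
                      for b in range(4) for c in range(7))
print(errors)
```

The dictionary `errors` records, for each $\ell$, the largest discrepancy between $M_H(b,c)$ and its Poisson limit over the small window of classes considered.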


In view of this lemma, passing to the limit in the finite system, we obtain the infinite
system of equations

$$ y(k) \big( (\sigma - 1)\, y(0) + 1 \big) = \sum_{0 \le h \le k} y(h) A_H(h)\, e^{-a}\, \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 0. \tag{6.3} $$

Notice that the new system of equations is triangular, i.e., the $k$-th equation depends
only on the variables $y(0), \dots, y(k)$.


Chapter 7
Error Threshold and Quasispecies


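In this chapter, we study the infinite system obtained at the limit in the long chain
regime (see section 6.4). This system reads:

$$ y(k) \big( (\sigma - 1)\, y(0) + 1 \big) = \sum_{0 \le h \le k} y(h) A_H(h)\, e^{-a}\, \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 0. \tag{7.1} $$

A solution $y$ is a probability distribution if it satisfies the additional constraint

$$ \sum_{k=0}^{\infty} y(k) = 1. \tag{7.2} $$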
k=0 


7.1 The Error Threshold 


The nice feature of this new system is its triangular form, so that we might try to
solve it inductively. Let us take a look at the equation for $k = 0$:

$$ y(0) \big( (\sigma - 1)\, y(0) + 1 \big) = y(0)\, \sigma e^{-a}. $$

The only two solutions to this equation are

$$ y(0) = 0, \qquad y(0) = \frac{\sigma e^{-a} - 1}{\sigma - 1} . $$

On one hand, if $y(0) = 0$, the system of equations (7.1) implies that

$$ \forall k \ge 1 \qquad y(k) = \sum_{h=1}^{k} y(h)\, e^{-a}\, \frac{a^{k-h}}{(k-h)!} \tag{7.3} $$

and a straightforward induction shows that $y$ is identically 0, yet the null solution
does not satisfy the constraint (7.2) (with $y$ instead of $c$). On the other hand, suppose
that we take for initial condition $y(0) = (\sigma e^{-a} - 1)/(\sigma - 1)$. Replacing $y(0)$ on the
left-hand side of (7.1) and dividing by $e^{-a}$ on both sides, the recurrence relation
becomes

$$ y(k)\, \sigma = y(0)\, \sigma\, \frac{a^k}{k!} + \sum_{h=1}^{k} y(h)\, \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 1. \tag{7.4} $$

This recurrence relation completely defines the sequence $(y(k))_{k \ge 0}$. The terms of
this sequence are positive if and only if $\sigma e^{-a} > 1$. The critical value $a = \ln \sigma$ is often
referred to as the error threshold. In the long chain regime, the quantity $\ell q$ is roughly
$a$, hence it is the limiting value for the mean number of mutations per chromosome
in each reproduction cycle. The error threshold is defined as the critical mutation
rate below which the master sequence is present in a positive proportion. In
the next section we will show that, when $a < \ln \sigma$, a solution of (7.1) satisfying the
constraint (7.2) exists, and we will obtain a formula for it. This solution is called a
quasispecies, and along with the error threshold, it forms the core of Eigen's work
on the topic.

The aim of the subsequent chapters is to study how the concepts of quasispecies 
and error threshold translate to the finite population models introduced in chapters 3 
and 4. 
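
The inductive resolution of the triangular system described in section 7.1 can be sketched in a few lines. In the example below, which is only an illustration, the function name `solve_triangular`, the values $\sigma = 5$, $a = 1$ and $a = 2$, and the truncation at $k = 60$ are assumptions; the recurrence itself is (7.4), where the term $h = k$ of the sum has been moved to the left-hand side.

```python
from math import exp, factorial


def solve_triangular(sigma, a, kmax):
    """Solve (7.4) inductively, starting from y(0) = (sigma*e^{-a} - 1)/(sigma - 1)."""
    y = [(sigma * exp(-a) - 1.0) / (sigma - 1.0)]
    for k in range(1, kmax + 1):
        rhs = y[0] * sigma * a ** k / factorial(k)
        rhs += sum(y[h] * a ** (k - h) / factorial(k - h) for h in range(1, k))
        # The term h = k of (7.4) equals y(k): it is moved to the left-hand side,
        # which gives y(k) * (sigma - 1) = rhs.
        y.append(rhs / (sigma - 1.0))
    return y


below = solve_triangular(sigma=5.0, a=1.0, kmax=60)   # a < ln(5)
above = solve_triangular(sigma=5.0, a=2.0, kmax=60)   # a > ln(5)
print(min(below) > 0, sum(below))
print(above[0])
```

With these values, the first call lies below the error threshold $\ln 5$ and produces positive terms, whereas the second call starts from a negative value of $y(0)$ and therefore cannot yield a probability distribution.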


7.2 The Distribution of the Quasispecies 


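We proceed now to solve the system of equations (7.1). As discussed in the previous
section, we can expect to find a solution satisfying the constraint (7.2) only in the
case where $\sigma e^{-a} > 1$ and

$$ y(0) = \frac{\sigma e^{-a} - 1}{\sigma - 1} . $$

So we suppose that we are in this case and we try to solve the recurrence relation (7.4).
We choose to use the method of generating functions (a beautiful account of this
method can be found in chapter 7 of [42]). Set

$$ g(X) = \sum_{k \ge 0} y(k) X^k. $$

Using the recurrence relation (7.4), we have

$$
\begin{aligned}
g(X)\, e^{aX} &= \sum_{k \ge 0} \sum_{h=0}^{k} y(h)\, \frac{a^{k-h}}{(k-h)!}\, X^k \\
&= \sum_{k \ge 0} \Big( y(k)\, \sigma - y(0) (\sigma - 1) \frac{a^k}{k!} \Big) X^k = \sigma\, g(X) - y(0) (\sigma - 1)\, e^{aX}.
\end{aligned}
$$

Replacing $y(0)$ by its value, we get

$$
\begin{aligned}
g(X) &= (\sigma e^{-a} - 1)\, \frac{e^{aX}}{\sigma - e^{aX}} = (\sigma e^{-a} - 1) \sum_{h \ge 1} \Big( \frac{e^{aX}}{\sigma} \Big)^h \\
&= (\sigma e^{-a} - 1) \sum_{h \ge 1} \sum_{k \ge 0} \frac{(ahX)^k}{k!\, \sigma^h} = (\sigma e^{-a} - 1) \sum_{k \ge 0} \frac{a^k}{k!} X^k \sum_{h \ge 1} \frac{h^k}{\sigma^h} .
\end{aligned}
$$

We deduce from here that

$$ \forall k \ge 0 \qquad y(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h \ge 1} \frac{h^k}{\sigma^h} . $$

The above formula is a genuine probability distribution on the non-negative integers,
indeed all these numbers add up to one, as can be seen by replacing $X$ by 1 in the
equality

$$ g(X) = (\sigma e^{-a} - 1)\, \frac{e^{aX}}{\sigma - e^{aX}} . $$

We call this distribution the quasispecies distribution.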


Definition 7.1 Let $\sigma$, $a$ be such that $\sigma e^{-a} > 1$. The distribution of the quasispecies
of parameters $\sigma$ and $a$ is the distribution $Q(\sigma,a)$ on the non-negative integers defined
by

$$ \forall k \ge 0 \qquad Q(\sigma,a)(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h \ge 1} \frac{h^k}{\sigma^h} . $$

Notation. In the sequel, when the parameters $\sigma$, $a$ are fixed, we might omit them
from the notation, and we write simply $Q$ or $Q(k)$ instead of $Q(\sigma,a)$ or $Q(\sigma,a)(k)$.
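
The formula of definition 7.1 is easy to evaluate numerically, which gives a quick way of checking that it sums to one and that it solves the system (7.1). The sketch below is an illustration under assumptions: the function name `quasispecies`, the truncation level `hmax` of the series in $h$, the window $k \le 60$ and the values $\sigma = 5$, $a = 1$ are all chosen for the example.

```python
from math import exp, factorial, log


def quasispecies(sigma, a, k, hmax=400):
    """Q(sigma, a)(k) from definition 7.1, with the series truncated at hmax."""
    # h^k / sigma^h is evaluated as exp(k*log(h) - h*log(sigma)) to avoid overflow.
    series = sum(exp(k * log(h) - h * log(sigma)) for h in range(1, hmax + 1))
    return (sigma * exp(-a) - 1.0) * a ** k / factorial(k) * series


sigma, a = 5.0, 1.0
Q = [quasispecies(sigma, a, k) for k in range(61)]
total = sum(Q)


def residual(k):
    """Difference between the two sides of the k-th equation of (7.1)."""
    left = Q[k] * ((sigma - 1.0) * Q[0] + 1.0)
    right = sum(Q[h] * (sigma if h == 0 else 1.0)
                * exp(-a) * a ** (k - h) / factorial(k - h)
                for h in range(k + 1))
    return left - right


worst = max(abs(residual(k)) for k in range(20))
print(total, worst)
```

Here `total` approximates the sum of the distribution over the window considered, and `worst` is the largest residual of the first twenty equations of (7.1).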


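The quasispecies distribution $Q(\sigma,a)$ can be expressed in terms of the polylogarithm
or Jonquière's function. Let $s, z \in \mathbb{C}$, with $|z| < 1$. The polylogarithm of order
$s$ and argument $z$ is defined by

$$ \mathrm{Li}_s(z) = \sum_{h \ge 1} \frac{z^h}{h^s} . $$

In view of this definition,

$$ \forall k \ge 0 \qquad Q(\sigma,a)(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!}\, \mathrm{Li}_{-k}\Big( \frac{1}{\sigma} \Big). $$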

Eigen described a quasispecies as a population of individuals having a positive
concentration of the master sequence along with a cloud of mutants. We now have
an explicit formula for the concentrations of the master sequence and the different
mutant classes in Eigen's original quasispecies model. Thanks to this formula, we can
easily draw the quasispecies distribution and study its dependence on the parameters
$\sigma$, $a$: see figures 7.1 and 7.2. In the quasispecies literature, this was previously done
by integrating numerically the differential system (see for instance [30], figure 3,
p. 9).

Fig. 7.1 The quasispecies distribution as a function of $a$ for $\sigma = 5$

Fig. 7.2 The quasispecies distribution as a function of $a$ for $\sigma$ = 10°

Besides, we can now perform simple theoretical computations with the help of
this distribution. For instance, the mutation rate at which the concentration of the
master sequence becomes less than 1/2 is

$$ a = \ln\Big( \frac{2\sigma}{\sigma + 1} \Big). $$

The mutation rate at which the concentration of the first Hamming class becomes
larger than the concentration of the master sequence is $a = 1 - 1/\sigma$.

Chapter 8
Probabilistic Derivation


The formula for the quasispecies distribution given in definition 7.1 has been obtained 
by writing the limiting quasispecies equation in the long chain regime, and then 
solving the infinite triangular system (7.1) with the help of generating functions. In 
this chapter, we take an alternative road. The quasispecies distribution describes the 
stable equilibrium of an infinite population in the long chain regime. We consider 
first the equilibrium states of the finite population models introduced in chapters 3 
and 4. These equilibrium states could not be described by a simple formula, yet
in the limit where the population size becomes infinite, they corresponded to the
normalized Perron-Frobenius eigenvector $c^*$ of the matrix $W$ defined by

$$ \forall u,v \in E \qquad W(u,v) = A(u) M(u,v). $$


As we have seen in part I, the normalized Perron—Frobenius eigenvector c* of W is 
the only solution of the fundamental equation (2.1) under the constraint (2.2). We 
shall study the asymptotic behavior of c* in the long chain regime. To achieve this 
goal, we rely on the probabilistic representation of the Perron—Frobenius eigenvector 
in terms of a stopped random walk given in section 5.1. Not only does this technique 
provide another method for deriving the quasispecies distribution, but it yields a 
simple and natural probabilistic representation of the quasispecies distribution in 
terms of the Poisson random walk on the integers. 


8.1 Asymptotics of c* 


We consider here the solution $c^*$ of the quasispecies equation on the sharp peak
landscape, and we study its asymptotic behavior in the long chain regime. We prove
that it converges towards the solution of the infinite triangular system (7.1). We work
in the framework described in chapter 6. So the space of genotypes is $E = \mathcal{A}^\ell$, the
set of the sequences of length $\ell$ over the alphabet $\mathcal{A}$, the mutation matrix $M(u,v)$
is the one defined in section 6.1, while the fitness function $A$ is the one defined in
section 6.2. Finally, the mean reproduction matrix $W$ is defined by

$$ \forall u,v \in E \qquad W(u,v) = A(u) M(u,v). $$

As explained in section 6.3, when working with the sharp peak landscape, we can
lump together the states belonging to the same Hamming class. So we consider the
positive matrix $W_H$ defined by

$$ W_H(i,j) = A_H(i) M_H(i,j), \qquad 0 \le i,j \le \ell, $$

where $A_H$ and $M_H$ are defined in section 6.3. We denote by $c^*_H$ the normalized
Perron-Frobenius eigenvector of $W_H$, which is the normalized solution of the
system (6.1). Using lemma 6.1, we see that $c^*_H$ is the lumped version of $c^*$:

$$ \forall i \in \{0, \dots, \ell\} \qquad c^*_H(i) = \sum_{u \in H_i} c^*(u). \tag{8.1} $$

Moreover the Perron-Frobenius eigenvalues of the matrices $W$ and $W_H$ coincide.
We denote by $\lambda$ their common value. We prove next that $c^*_H$ converges towards the
quasispecies distribution in the long chain regime.


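Theorem 8.1 In the long chain regime $\ell \to \infty$, $q \to 0$ and $\ell q \to a$, we have the
following dichotomy:

• if $\sigma e^{-a} \le 1$, then $\lambda \to 1$, $c^*_H \to 0$.

• if $\sigma e^{-a} > 1$, then $\lambda \to \sigma e^{-a}$, $c^*_H \to Q(\sigma,a)$.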


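Proof We recall that the Perron-Frobenius eigenvalue $\lambda$ is equal to the mean fitness
of the population at equilibrium, that is,

$$ \lambda = \sigma\, c^*_H(0) + 1 - c^*_H(0). $$

In particular, $\lambda \ge 1$. Up to the extraction of a subsequence, we can suppose that the
following limits exist:

$$ \lambda^* = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \lambda, \qquad \eta^*(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} c^*_H(k), \quad k \ge 0. $$

Writing down the first equation of the system $\lambda (c^*_H)^T = (c^*_H)^T W_H$, we see that

$$ \sigma\, c^*_H(0) M_H(0,0) \le \lambda\, c^*_H(0) \le \sigma\, c^*_H(0) M_H(0,0) + \max_{1 \le i \le \ell} M_H(i,0). $$

We conclude that $\lambda^* \ge \max(1, \sigma e^{-a})$. As we have already pointed out,

$$ \lambda = (\sigma - 1)\, c^*_H(0) + 1. $$

Taking the limits in the above two equations, we deduce that

$$ \lambda^* = (\sigma - 1)\, \eta^*(0) + 1, \qquad \lambda^* \eta^*(0) = \sigma\, \eta^*(0)\, e^{-a}. $$

Since $\lambda^* \ge \max(1, \sigma e^{-a})$, we conclude that:

• If $\sigma e^{-a} \le 1$, then $\lambda^* = 1$ and $\eta^*(0) = 0$.

• If $\sigma e^{-a} > 1$, then

$$ \lambda^* = \sigma e^{-a} \qquad \text{and} \qquad \eta^*(0) = \frac{\sigma e^{-a} - 1}{\sigma - 1} . $$

Finally, writing down the $k$-th equation of the system $\lambda (c^*_H)^T = (c^*_H)^T W_H$, we see that

$$
\begin{aligned}
\sigma\, c^*_H(0) M_H(0,k) + \sum_{i=1}^{k} c^*_H(i) M_H(i,k) &\le \lambda\, c^*_H(k) \\
&\le \sigma\, c^*_H(0) M_H(0,k) + \sum_{i=1}^{k} c^*_H(i) M_H(i,k) + \max_{k < i \le \ell} M_H(i,k).
\end{aligned}
$$

Taking the limit, we recover the recurrence relation (7.1), and the result of the
theorem follows. □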
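
The convergence stated in theorem 8.1 can be observed on a moderately long chain. The sketch below is an illustration under assumptions: the values $\sigma = 5$, $a = 1$, $\kappa = 2$, $\ell = 60$, the use of power iteration, the truncation of the series defining $Q(\sigma,a)$ and all function names are choices made for the example. Since the chain is finite, only an approximate agreement with the limit is expected.

```python
from math import comb, exp, factorial, log


def binom_table(n, p):
    """List of P(Bin(n, p) = k) for k = 0, ..., n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def lumped_matrix(ell, kappa, q, sigma):
    """W_H(i, j) = A_H(i) * M_H(i, j) on the Hamming classes 0, ..., ell."""
    W = []
    for b in range(ell + 1):
        px = binom_table(b, q / (kappa - 1))      # incorrect letters corrected
        py = binom_table(ell - b, q)              # correct letters altered
        fitness = sigma if b == 0 else 1.0
        row = []
        for c in range(ell + 1):
            m = sum(px[x] * py[c - b + x] for x in range(b + 1)
                    if 0 <= c - b + x <= ell - b)
            row.append(fitness * m)
        W.append(row)
    return W


def left_perron(W, iters=300):
    """Power iteration for the left Perron-Frobenius pair of a positive matrix."""
    n = len(W)
    c = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        new = [sum(c[v] * W[v][w] for v in range(n)) for w in range(n)]
        lam = sum(new)               # c sums to 1, so this estimates lambda
        c = [x / lam for x in new]
    return lam, c


def quasispecies(sigma, a, k, hmax=400):
    """Q(sigma, a)(k) from definition 7.1, with the series truncated at hmax."""
    series = sum(exp(k * log(h) - h * log(sigma)) for h in range(1, hmax + 1))
    return (sigma * exp(-a) - 1.0) * a ** k / factorial(k) * series


sigma, a, kappa, ell = 5.0, 1.0, 2, 60
lam, c_H = left_perron(lumped_matrix(ell, kappa, a / ell, sigma))
gap = max(abs(c_H[k] - quasispecies(sigma, a, k)) for k in range(6))
print(lam, sigma * exp(-a), gap)
```

The printed values compare the finite-chain eigenvalue with its limit $\sigma e^{-a}$ and report the largest gap between $c^*_H(k)$ and $Q(\sigma,a)(k)$ over the first six Hamming classes.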


8.2 Limit of the Mutant Walk Representation 


A consequence of theorem 8.1 is that the quasispecies distribution can be obtained
as the simple limit of the Perron-Frobenius eigenvector $c^*_H$. The vector $c^*_H$ is the
lumped version of the vector $c^*$ (see formula (8.1)). The vector $c^*$ in turn admits a
natural probabilistic representation in terms of the mutant walk, built with the help of
the matrices $W$ and $M$, as explained in section 5.1. This probabilistic representation
is our starting point; we try next to understand its asymptotic behavior in the long
chain regime.

We consider a Markov chain (W,,),>0 on A® which has M for its transition matrix. 
Let t* be the time of the first return to w*, defined by 


T aint {n> 12W,yow'}. 


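We apply proposition 5.1 in this specific setting, with $w^*$ for the starting point. The
equation (5.1) satisfied by the eigenvalue $\lambda$ becomes

$$E_{w^*}\Big( \frac{\sigma}{\lambda^{\tau^*}} \Big) = 1. \tag{8.2}$$

The representation formula (5.2) for the left Perron–Frobenius eigenvector $c^\circ$ of the
matrix $W$ satisfying $c^\circ(w^*) = 1$ becomes

$$\forall u \in \mathcal{A}^\ell \setminus \{w^*\} \qquad c^\circ(u) = E_{w^*}\Big( \sum_{n=0}^{\tau^*-1} 1_{\{W_n = u\}}\, \lambda^{-n} \sigma \Big). \tag{8.3}$$

Finally, from corollary 5.2, we have

$$c^*(w^*) = \frac{1}{E_{w^*}\Big( \sum_{n=1}^{\tau^*} \lambda^{-n} \sigma \Big)}. \tag{8.4}$$

The three equations (8.2), (8.3), (8.4) completely determine the eigenvalue $\lambda$ and
the normalized Perron–Frobenius eigenvector $c^*$. We shall rely on these equations
to study the asymptotic behavior of $c^*$ and $c^*_H$ in the long chain regime

$$\ell \to \infty, \qquad q \to 0, \qquad \ell q \to a.$$

Our technique consists in conditioning on the value of the random time $\tau^*$ in order
to rewrite the expectations as infinite sums. The three equations (8.2), (8.3), (8.4)
then become

$$\sum_{n \geq 1} \frac{\sigma}{\lambda^n}\, P(\tau^* = n) = 1, \tag{8.5}$$

$$\forall u \in \mathcal{A}^\ell \setminus \{w^*\} \qquad c^\circ(u) = \sum_{n \geq 0} \frac{\sigma}{\lambda^n}\, P(\tau^* > n,\ W_n = u), \tag{8.6}$$

$$c^*(w^*) = \Big( \sum_{n \geq 1} \frac{\sigma}{\lambda^n}\, P(\tau^* \geq n) \Big)^{-1}. \tag{8.7}$$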

Our next goal is to examine the behavior of these formulas in the long chain regime.
Let us start by examining the asymptotic behavior of the distribution of $\tau^*$. First, we
have

$$P(\tau^* = 1) = M(w^*, w^*) = (1-q)^\ell \to e^{-a}.$$

Second, for any fixed $n \geq 2$, we have

$$\begin{aligned}
P(\tau^* = n) &= P(\tau^* = n,\ W_1 \neq w^*) \leq P(W_n = w^*,\ W_1 \neq w^*) \\
&\leq P\big(\exists\, i \in \{1,\dots,\ell\} \quad W_1(i) \neq w^*(i),\ W_n(i) = w^*(i)\big) \\
&\leq P\Big(\begin{array}{l}\text{there exists an } i \text{ in } \{1,\dots,\ell\} \text{ such that, on the } i\text{-th letter,} \\ \text{one mutation occurs at time 1 and another one before time } n\end{array}\Big).
\end{aligned}$$

This last probability is less than or equal to $\ell n q^2$, which goes to 0. We conclude that

$$\forall n \geq 2 \qquad P(\tau^* = n) \leq \ell n q^2.$$

Summing the previous inequality, we get furthermore

$$\forall n \geq 2 \qquad P(2 \leq \tau^* \leq n) \leq \ell n^2 q^2. \tag{8.8}$$

If we fix $n \geq 2$, these bounds go to 0 in the long chain regime. In fact, once a
mutation has occurred, it is very unlikely to be reversed before other mutations
appear elsewhere. In the long chain regime, the distribution of the random time $\tau^*$
converges towards $e^{-a}\delta_1 + (1 - e^{-a})\delta_{+\infty}$. Let us now examine the behavior of $\lambda$.
Equation (8.5) readily implies that

$$\frac{\sigma}{\lambda}(1-q)^\ell \leq 1,$$

whence

$$\liminf_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \lambda \geq \sigma e^{-a}.$$


From now onwards, we consider only the case where $\sigma e^{-a} > 1$. The above inequality
implies that, in the long chain regime,

$$\lambda \geq \frac{1}{2}\big(1 + \sigma e^{-a}\big) > 1.$$

Therefore the series appearing in formulas (8.5), (8.6), (8.7) are dominated by a
convergent geometric series. To obtain their limits in the long chain regime, we can thus
take the limits inside the sums, so we need only to examine the asymptotic behavior
of each term inside the sums. For instance, passing to the limit in formula (8.5) yields
directly

$$\lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \lambda = \sigma e^{-a}.$$


This furnishes an alternative proof for the convergence of the Perron–Frobenius
eigenvalue $\lambda$. In a similar vein, passing to the limit in formula (8.7) yields

$$\lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} c^*(w^*) = \Big( \frac{\sigma}{\sigma e^{-a}} + (1 - e^{-a}) \sum_{n \geq 2} \frac{\sigma}{(\sigma e^{-a})^n} \Big)^{-1} = \frac{\sigma e^{-a} - 1}{\sigma - 1}, \tag{8.9}$$

and we recover the limiting value $Q(\sigma,a)(0)$ for the concentration of the master
sequence in the quasispecies distribution. To obtain the values for the other classes,
we use formula (8.6). This is more delicate, because in the long chain regime, the
probability of a fixed element $u \in \mathcal{A}^\ell$ distinct from $w^*$ vanishes. Therefore we
fix $k \geq 1$ and we consider the probability mass of the $k$-th Hamming class, or
equivalently we lump together the elements of $\mathcal{A}^\ell$ as in formula (8.1), so we study

$$\frac{c^*_H(k)}{c^*(w^*)} = \sum_{u \in H_k} c^\circ(u) = \sum_{n \geq 0} \frac{\sigma}{\lambda^n}\, P(\tau^* > n,\ W_n \in H_k). \tag{8.10}$$
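Let us focus on the probability inside the sum. Let $k, n \geq 1$ be fixed. Using the
inequality (8.8), we have

$$\begin{aligned}
P(\tau^* > n,\ W_n \in H_k) &= P(\tau^* > n,\ W_n \in H_k,\ W_1 \neq w^*) \\
&= P(W_n \in H_k,\ W_1 \neq w^*) - P(2 \leq \tau^* \leq n,\ W_n \in H_k,\ W_1 \neq w^*) \\
&= P(W_n \in H_k,\ W_1 \neq w^*) + O(\ell n^2 q^2).
\end{aligned}$$

Furthermore, we have

$$\begin{aligned}
P(W_n \in H_k,\ W_1 \neq w^*) &= P(W_n \in H_k) - P(W_n \in H_k,\ W_1 = w^*) \\
&= P(W_n \in H_k) - (1-q)^\ell\, P(W_{n-1} \in H_k).
\end{aligned}$$

Substituting this into formula (8.10) and reindexing the sums, we obtain

$$\begin{aligned}
\frac{c^*_H(k)}{c^*(w^*)} &= \sum_{n \geq 1} \frac{\sigma}{\lambda^n} \Big( P(W_n \in H_k) - (1-q)^\ell\, P(W_{n-1} \in H_k) + O(\ell n^2 q^2) \Big) \\
&= \sum_{n \geq 1} \frac{\sigma}{\lambda^n} \Big( 1 - \frac{(1-q)^\ell}{\lambda} \Big) P(W_n \in H_k) + O(\ell q^2).
\end{aligned}$$

Since $c^*(w^*) = c^*_H(0)$, theorem 8.1 gives

$$\frac{c^*_H(k)}{c^*_H(0)} \ \xrightarrow[\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}]{} \ \frac{Q(k)}{Q(0)}.$$

Interchanging the limit and the summation, we obtain

$$\frac{Q(k)}{Q(0)} = \sum_{n \geq 1} \frac{\sigma - 1}{(\sigma e^{-a})^n} \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} P(W_n \in H_k), \tag{8.11}$$

provided that the limit inside the sum exists. Using the expression of $Q(0)$, and noting
that the formula holds also for $k = 0$, we conclude that

$$\forall k \geq 0 \qquad Q(k) = \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n} \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} P(W_n \in H_k). \tag{8.12}$$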


To study the limit inside the sum, we shall keep track of the mutation events with
the help of the following random binary array. For $1 \leq i \leq n$ and $1 \leq j \leq \ell$, we
set $E_{i,j} = 1$ if a mutation occurs on the $j$-th digit at time $i$, and 0 otherwise. More
precisely, we set

$$E_{i,j} = \begin{cases} 1 & \text{if } W_i(j) \neq W_{i-1}(j), \\ 0 & \text{if } W_i(j) = W_{i-1}(j). \end{cases}$$

The random variables $E_{i,j}$ are i.i.d. Bernoulli variables with parameter $q$. Let $\mathcal{E}_n$ be
the event that there is at most one mutation in each column of the array
$(E_{i,j},\ 1 \leq i \leq n,\ 1 \leq j \leq \ell)$, i.e.,

$$\mathcal{E}_n = \Big\{\, \forall j \in \{1,\dots,\ell\} \quad \sum_{1 \leq i \leq n} E_{i,j} \leq 1 \,\Big\}.$$

In the long chain regime, for fixed $n$, the mutation array is a very long flat rectangle.
Typically, there are a few mutations on each line, but at most one mutation in each
column. More precisely, we have the upper bound

$$P(\mathcal{E}_n^c) \leq P\big( \exists\, j \in \{1,\dots,\ell\}\ \ \exists\, 1 \leq i < i' \leq n \quad E_{i,j} = E_{i',j} = 1 \big) \leq \ell n^2 q^2,$$

and therefore the probability of the event $\mathcal{E}_n$ goes to 1. On the event $\mathcal{E}_n$, the Hamming
class of $W_n$ is simply equal to the total number of mutations that have occurred until
time $n$. Indeed, since the mutations occur on different columns, no mutation can be
reversed. This provides a simple way to compute the probability that $W_n$ is in $H_k$:

$$\begin{aligned}
P(W_n \in H_k) &= P(W_n \in H_k,\ \mathcal{E}_n) + O(\ell n^2 q^2) \\
&= P\Big( \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq \ell} E_{i,j} = k,\ \mathcal{E}_n \Big) + O(\ell n^2 q^2) \\
&= P\Big( \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq \ell} E_{i,j} = k \Big) + O(\ell n^2 q^2).
\end{aligned} \tag{8.13}$$


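The distribution of the double sum in the probability is the binomial $\mathrm{Bin}(n\ell, q)$ with
parameters $n\ell$, $q$. In the long chain regime, for $n$ fixed, we have $n\ell \to \infty$, $q \to 0$ and
$n\ell q \to na$. We are therefore in the regime where the binomial distribution $\mathrm{Bin}(n\ell, q)$
converges towards the Poisson distribution $\mathcal{P}(na)$ of parameter $na$. We conclude that

$$\lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} P(W_n \in H_k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} P\big(\mathrm{Bin}(n\ell, q) = k\big) = P\big(\mathcal{P}(na) = k\big).$$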
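The binomial-to-Poisson convergence used here is easy to observe numerically. The following sketch, with hypothetical helper names, compares the two probability mass functions for $q = a/\ell$ and increasing $\ell$; it is an illustration of the limit, not part of the proof.

```python
import math

def binom_pmf(n, p, k):
    """P(Bin(n, p) = k)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(mu, k):
    """P(Poisson(mu) = k)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def max_gap(n, a, ell, kmax=20):
    """Largest pointwise gap between Bin(n*ell, a/ell) and Poisson(n*a) on 0..kmax."""
    q = a / ell
    return max(abs(binom_pmf(n * ell, q, k) - poisson_pmf(n * a, k))
               for k in range(kmax + 1))

if __name__ == "__main__":
    for ell in (10, 100, 1000, 10000):
        print(ell, max_gap(3, 0.5, ell))
```

The printed gaps shrink as $\ell$ grows, in line with the error term $O(\ell n^2 q^2)$ appearing in (8.13).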


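Substituting this into equation (8.12), we obtain finally

$$Q(k) = \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, e^{-na} \frac{(na)^k}{k!} = (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{n \geq 1} \frac{n^k}{\sigma^n}.$$

We have recovered the formula obtained in section 7.2 with the method of the
generating function.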
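As a sanity check on this series, one can evaluate it numerically and verify that it defines a probability distribution whose mass at 0 is $(\sigma e^{-a} - 1)/(\sigma - 1)$. The sketch below truncates the sum over $n$; the function name and the truncation level are arbitrary choices made for the illustration.

```python
import math

def quasispecies_series(sigma, a, k, terms=2000):
    """Truncated series (sigma e^{-a} - 1) * a^k / k! * sum_{n>=1} n^k / sigma^n.
    Returns 0 when sigma e^{-a} <= 1, following the dichotomy of theorem 8.1."""
    rho = sigma * math.exp(-a)
    if rho <= 1.0:
        return 0.0
    total = 0.0
    for n in range(1, terms + 1):
        # n^k / sigma^n, computed through logarithms to avoid overflow
        total += math.exp(k * math.log(n) - n * math.log(sigma))
    return (rho - 1.0) * a**k / math.factorial(k) * total

if __name__ == "__main__":
    sigma, a = 4.0, 0.5
    print([quasispecies_series(sigma, a, k) for k in range(5)])
```

Summing `quasispecies_series(sigma, a, k)` over $k$ gives a value numerically indistinguishable from 1 in the supercritical case.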


8.3 The Poisson Random Walk 


In this section, we adopt a different strategy. Instead of looking for a simple explicit
formula, we try to describe the quasispecies distribution as a functional of a limiting
probabilistic object. To this end, we shall evaluate the limit of $P(W_n \in H_k)$ in a new
way. For $n \geq 1$, let us denote by $N_n$ the number of mutations occurring in $W_n$ at time
$n$, i.e.,

$$N_n = \big|\{\, 1 \leq j \leq \ell : W_n(j) \neq W_{n-1}(j) \,\}\big|.$$

On the event $\mathcal{E}_n$, the sequence $W_n$ is in $H_k$ if and only if the total number of mutations
having occurred on $W_n$ is equal to $k$, thus

$$P(W_n \in H_k,\ \mathcal{E}_n) = P(N_1 + \cdots + N_n = k,\ \mathcal{E}_n).$$

The random variables $N_1, \dots, N_n$ are i.i.d. with common distribution the binomial
with parameters $\ell$, $q$. We are in the regime where the binomial distribution $\mathrm{Bin}(\ell, q)$
converges towards the Poisson distribution $\mathcal{P}(a)$. Let $(Y_n)_{n \geq 1}$ be a sequence of i.i.d.
random variables with common distribution $\mathcal{P}(a)$. We consider the random walk on
the non-negative integers, given by $S_0 = 0$ and

$$\forall n \geq 1 \qquad S_n = Y_1 + \cdots + Y_n.$$

From the previous discussion, we conclude that

$$\lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} P(W_n \in H_k) = P(S_n = k).$$


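Plugging this into the representation formula (8.12), we have

$$Q(\sigma, a)(k) = \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, P(S_n = k).$$

The factor in front of the probability can naturally be interpreted as a killing factor.
More precisely, let $\tau$ be a random variable, which is independent of the Poisson
random walk, with geometric distribution of parameter $1 - (\sigma e^{-a})^{-1}$, i.e.,

$$\forall n \geq 1 \qquad P(\tau = n) = \Big( 1 - \frac{1}{\sigma e^{-a}} \Big) \Big( \frac{1}{\sigma e^{-a}} \Big)^{n-1}.$$

The expectation of $\tau$ is given by $E(\tau) = \sigma e^{-a}/(\sigma e^{-a} - 1)$. The formula for $Q(\sigma, a)(k)$
can now be rewritten as

$$\begin{aligned}
Q(\sigma, a)(k) &= \frac{1}{E(\tau)} \sum_{n \geq 1} P(\tau \geq n)\, P(S_n = k) \\
&= \frac{1}{E(\tau)} \sum_{n \geq 1} P(S_n = k,\ \tau \geq n) = \frac{1}{E(\tau)}\, E\Big( \sum_{n=1}^{\tau} 1_{\{S_n = k\}} \Big).
\end{aligned}$$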

We have therefore proved the following result. 


Proposition 8.2 Let $\sigma$, $a$ be such that $\sigma e^{-a} > 1$. Let $\tau$ be a random variable, which
is independent of the Poisson random walk, with geometric distribution of parameter
$1 - (\sigma e^{-a})^{-1}$. The quasispecies distribution $Q(\sigma, a)$ is equal to the mean empirical
distribution of the Poisson random walk between times 1 and $\tau$, i.e.,

$$\forall k \geq 0 \qquad Q(\sigma, a)(k) = \frac{1}{E(\tau)}\, E\Big( \sum_{n=1}^{\tau} 1_{\{S_n = k\}} \Big). \tag{8.14}$$


This probabilistic construction provides the following intuitive picture for the struc- 
ture of the quasispecies. The evolution of the genotype along a lineage is modeled 
by a Poisson random walk in the genotype space, starting from the master sequence. 
Because of the presence of the master sequence in the population, the lineages are 
bound to become extinct. The time t models the survival time of a lineage. 
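The representation of proposition 8.2 lends itself to a direct Monte Carlo check. The sketch below samples the killing time and the Poisson random walk independently, counts the visits to each level between times 1 and the killing time, and divides by the total time spent. The samplers are simple textbook ones (product-of-uniforms for the Poisson law, repeated trials for the geometric law), and the function names, seed and sample size are assumptions made for this illustration.

```python
import math
import random

def sample_poisson(a, rng):
    """Product-of-uniforms sampler for the Poisson law of parameter a."""
    threshold = math.exp(-a)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def sample_geometric(p, rng):
    """Geometric law on {1, 2, ...}: P(tau = n) = p (1 - p)^(n - 1)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

def killed_walk_estimate(sigma, a, kmax, samples, seed=0):
    """Estimate E(sum_{n=1}^{tau} 1{S_n = k}) / E(tau) for k = 0..kmax."""
    rng = random.Random(seed)
    p = 1.0 - 1.0 / (sigma * math.exp(-a))  # requires sigma e^{-a} > 1
    visits = [0] * (kmax + 1)
    total_time = 0
    for _ in range(samples):
        tau = sample_geometric(p, rng)
        total_time += tau
        s = 0
        for _ in range(tau):
            s += sample_poisson(a, rng)
            if s <= kmax:
                visits[s] += 1
    return [v / total_time for v in visits]
```

With $\sigma = 4$ and $a = 0.5$, the first entry of `killed_walk_estimate(4.0, 0.5, 3, 100000)` should be close to $(\sigma e^{-a} - 1)/(\sigma - 1)$, up to sampling error.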


8.4 Formal Derivation 


The formula of proposition 8.2 has been proved after a rather tedious computation. 
We explain another way to guess and possibly prove this formula. Our starting point 
is again the probabilistic representation for the solution of equation (2.1) in terms of 
a stopped random walk given in section 5.1. In contrast to section 8.2, we write this 
formula directly for the random walk on the Hamming classes. More precisely, we
take for $E$ the set of the Hamming classes $\{0,\dots,\ell\}$, for $A$ the lumped sharp peak
fitness function $A_H$, and for $M$ the mutation matrix $M_H$ given by the difference of two
independent binomial laws, as in section 6.4. Define the Markov chain $(Z_n)_{n \geq 0}$ to be
the mutant walk on $\{0,\dots,\ell\}$ with transition matrix $M_H$. Applying the formula (5.3)
in this particular context, we obtain that the equilibrium concentration of the $k$-th
Hamming class is equal to

$$\forall k \geq 0 \qquad c^*(k) = \frac{E_0\Big( \sum_{n=0}^{\tau_0 - 1} 1_{\{Z_n = k\}}\, \lambda^{-n} \prod_{i=0}^{n-1} A_H(Z_i) \Big)}{E_0\Big( \sum_{n=0}^{\tau_0 - 1} \lambda^{-n} \prod_{i=0}^{n-1} A_H(Z_i) \Big)}. \tag{8.15}$$


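In the case of the sharp peak landscape and because $\tau_0$ is the hitting time of the
master sequence, the product in the expectation reduces to

$$\prod_{i=0}^{n-1} A_H(Z_i) = \begin{cases} 1 & \text{if } n = 0, \\ \sigma & \text{if } 1 \leq n < \tau_0, \end{cases}$$

and the formula (8.15) can be rewritten as

$$\forall k \geq 0 \qquad c^*(k) = \frac{E_0\big( 1_{\{Z_0 = k\}} \big) + E_0\Big( \sum_{n=1}^{\tau_0 - 1} 1_{\{Z_n = k\}}\, \lambda^{-n} \sigma \Big)}{1 + E_0\Big( \sum_{n=1}^{\tau_0 - 1} \lambda^{-n} \sigma \Big)}. \tag{8.16}$$

Let us examine what becomes of this formula in the long chain regime

$$\ell \to \infty, \qquad q \to 0, \qquad \ell q \to a.$$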


We know from theorem 8.1 that the Perron–Frobenius eigenvalue $\lambda$ converges towards
the maximum between $\sigma e^{-a}$ and 1. The asymptotics of the transition matrix
$M_H$ of the Markov chain $(Z_n)_{n \in \mathbb{N}}$ have been computed in lemma 6.2. This convergence
suggests a natural candidate for the limiting process, namely the Poisson
random walk that we define next. Let $(Y_n)_{n \geq 1}$ be a sequence of i.i.d. random variables
with distribution the Poisson law $\mathcal{P}(a)$ of parameter $a$. We define

$$\forall n \geq 1 \qquad S_n = Y_1 + \cdots + Y_n.$$


On a fixed time interval, the process $(Z_n)_{n \in \mathbb{N}}$ converges in distribution towards
the Poisson random walk $(S_n)_{n \in \mathbb{N}}$. Yet the matter might be much more delicate
for functionals of the whole process, like the one appearing in the formula (8.15). 
Indeed, such functionals involve arbitrarily long time intervals, and large deviations 
events might contribute substantially to the expectation, although they cannot be 
apprehended in the limiting process. A rigorous asymptotic analysis would require 
a lot of additional work, and it is not indispensable at this point since we already 
rigorously derived the formula for the quasispecies distribution. Our goal here is 
to explain another strategy leading to this formula, which has the advantage of 
revealing a natural probabilistic representation of the quasispecies. So we refrain 
from providing all the required mathematical details, and we only perform a formal 
passage to the limit in formula (8.16), that is, we guess the natural limiting candidate 
for each object, and we write the corresponding limiting formula. In the long chain 
regime, the first guess when passing to the limit in formula (8.16) is that c*(k) 
converges towards 


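$$\frac{E_0\big( 1_{\{S_0 = k\}} \big) + E_0\Big( \sum_{n=1}^{\tau_0 - 1} 1_{\{S_n = k\}}\, \max(\sigma e^{-a}, 1)^{-n}\, \sigma \Big)}{1 + E_0\Big( \sum_{n=1}^{\tau_0 - 1} \max(\sigma e^{-a}, 1)^{-n}\, \sigma \Big)}. \tag{8.17}$$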


The random time $\tau_0$ is now the first time of return to 0 for the Poisson random walk.
Yet the Poisson random walk $(S_n)_{n \in \mathbb{N}}$ has non-negative increments, so either $S_1 = 0$
and $\tau_0 = 1$, or $S_1 > 0$ and $\tau_0 = +\infty$. We now consider two cases:

• If $\sigma e^{-a} \leq 1$, then formula (8.17) is equal to 0.

• If $\sigma e^{-a} > 1$, then formula (8.17) is equal to

$$\frac{E_0\big( 1_{\{S_0 = k\}} \big) + E_0\Big( \sum_{n \geq 1} 1_{\{S_1 > 0\}}\, 1_{\{S_n = k\}}\, \dfrac{\sigma}{(\sigma e^{-a})^n} \Big)}{1 + (1 - e^{-a}) \sum_{n \geq 1} \dfrac{\sigma}{(\sigma e^{-a})^n}}.$$

A standard computation yields that the denominator is equal to

$$\frac{\sigma - 1}{\sigma e^{-a} - 1}.$$

For the case $k = 0$, only the first term in the numerator remains, and we recover the
expression for the concentration of the master sequence found in section 7.1. In the
case where $k \geq 1$, the trajectories which contribute to the expectation are such that
$S_1 > 0$ and $\tau_0 = +\infty$. Only the second expectation in the numerator remains and it
is equal to

$$\begin{aligned}
E_0\Big( \sum_{n \geq 1} 1_{\{S_1 > 0\}}\, 1_{\{S_n = k\}}\, \frac{\sigma}{(\sigma e^{-a})^n} \Big)
&= \sum_{n \geq 1} \frac{\sigma}{(\sigma e^{-a})^n} \big( P(S_n = k) - P(S_1 = 0,\ S_n = k) \big) \\
&= \sum_{n \geq 1} \frac{\sigma}{(\sigma e^{-a})^n} \big( P(S_n = k) - e^{-a} P(S_{n-1} = k) \big).
\end{aligned}$$

Putting together the previous formulas, we expect from the formal passage to the
limit that the quasispecies distribution is given by

$$\forall k \geq 0 \qquad Q(\sigma, a)(k) = (\sigma e^{-a} - 1) \sum_{n \geq 1} \frac{1}{(\sigma e^{-a})^n}\, P(S_n = k). \tag{8.18}$$

Now the random variable $S_n$ is the sum of $n$ i.i.d. random variables with distribution
the Poisson law $\mathcal{P}(a)$ of parameter $a$, thus its distribution is the Poisson law $\mathcal{P}(na)$,
whence

$$P(S_n = k) = e^{-na}\, \frac{(na)^k}{k!},$$

and plugging this formula into (8.18), we recover the expression for the concentration
of the Hamming class $k$ found in section 7.2.


Chapter 9
Summation of the Series


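So far, we have obtained an explicit formula for the quasispecies distribution in terms
of an infinite series. Our goal in this chapter is to compute the sum of this series and
to obtain finite formulas for the quasispecies distribution. Our starting point is the
following formula:

$$Q(\sigma, a)(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, P\left( \begin{array}{l} \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq \ell} E_{i,j} = k \\[4pt] \sum_{1 \leq i \leq n} E_{i,j} \leq 1, \quad 1 \leq j \leq \ell \end{array} \right) \tag{9.1}$$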
l<i<n 


where $(E_{i,j},\ 1 \leq i \leq n,\ 1 \leq j \leq \ell)$ are i.i.d. Bernoulli variables with parameter $q$.
This formula emerges out of the computations performed in chapter 8. It is obtained
as follows. Let $\mathcal{E}_n$ be the event that there is at most one mutation occurring in each
column of the array $(E_{i,j},\ 1 \leq i \leq n,\ 1 \leq j \leq \ell)$. The event $\mathcal{E}_n$ corresponds to
the second line in the probability above. The main point is that, for $n$ fixed, in the
long chain regime, the event $\mathcal{E}_n$ is a typical event, whose probability tends to 1. In
formula (8.13), we intersect the event appearing in the last probability with $\mathcal{E}_n$. Since
the probability of $\mathcal{E}_n$ tends to 1, we can then substitute (8.13) into the formula (8.11).
We then take the limit out of the infinite sum (thus reversing the steps which led
to this formula!), and we finally use formula (8.9) to get the correct normalization
factor. Now that formula (9.1) is established, the strategy consists in computing in
various ways the probability inside the sum. We shall express this probability as a
finite sum involving classical combinatorial quantities, namely the Eulerian numbers
and the Stirling numbers. After exchanging the order of the summations, we shall
perform the infinite summation with respect to $n$ and finally obtain finite formulas
for the quasispecies distribution. More precisely, we write

$$P\left( \begin{array}{l} \sum_{1 \leq i \leq n} \sum_{1 \leq j \leq \ell} E_{i,j} = k \\[4pt] \sum_{1 \leq i \leq n} E_{i,j} \leq 1, \quad 1 \leq j \leq \ell \end{array} \right) = q^k (1-q)^{n\ell - k}\, N(n, \ell, k),$$

where $N(n, \ell, k)$ is the number of matrices of size $n \times \ell$ containing $k$ ones and $n\ell - k$
zeros, with the additional constraint that there is at most one one in each column.
Plugging this into formula (9.1), we get


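$$Q(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, q^k (1-q)^{n\ell - k}\, N(n, \ell, k). \tag{9.2}$$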


Our next task is to compute $N(n, \ell, k)$. Perhaps the most natural way consists in first
choosing $k$ columns among the $\ell$ columns, which will be the columns containing
exactly one one. Once these columns are given, the number of choices for placing
the ones in the $k$ columns is $n^k$. Therefore

$$N(n, \ell, k) = \binom{\ell}{k}\, n^k. \tag{9.3}$$

Moreover we have

$$\lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \binom{\ell}{k}\, q^k (1-q)^{n\ell - k} = \frac{a^k}{k!}\, e^{-na}.$$
Substituting this into formula (9.2) yields again the formula obtained in section 7.2. 
However, in order to perform the summation in n, we shall compute N(n, €, k) in a 
different way. 


9.1 Stirling Numbers 


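The factor $n^k$ in formula (9.3) counts the number of ways of putting $k$ ones in $k$ fixed
columns of height $n$. It is also the number of maps from $\{1,\dots,k\}$ to $\{1,\dots,n\}$.
Let us denote by $\mathcal{F}(n, k)$ the collection of these maps. We shall enumerate the maps
of $\mathcal{F}(n, k)$ according to their range. More precisely, we partition $\mathcal{F}(n, k)$ as follows:

$$\mathcal{F}(n, k) = \bigcup_{R \subset \{1,\dots,n\}} \{\, f \in \mathcal{F}(n, k) : \operatorname{range} f = R \,\} = \bigcup_{1 \leq h \leq k}\ \bigcup_{\substack{R \subset \{1,\dots,n\} \\ |R| = h}} \{\, f \in \mathcal{F}(n, k) : \operatorname{range} f = R \,\}.$$

Taking the cardinals, we get

$$|\mathcal{F}(n, k)| = \sum_{1 \leq h \leq k}\ \sum_{\substack{R \subset \{1,\dots,n\} \\ |R| = h}} \big|\{\, f \in \mathcal{F}(n, k) : \operatorname{range} f = R \,\}\big|.$$

The cardinal inside the sum depends only on $h$, not on the specific choice of $R$ such
that $|R| = h$; it is equal to the cardinal of the collection $\mathcal{S}(k, h)$ of the surjective maps
from $\{1,\dots,k\}$ to $\{1,\dots,h\}$. We thus have

$$|\mathcal{F}(n, k)| = \sum_{1 \leq h \leq k} \binom{n}{h}\, |\mathcal{S}(k, h)|.$$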
1<h<k 


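Given a map $f$ in $\mathcal{S}(k, h)$, we look at the pre-images of $1, \dots, h$ through $f$. This
way we obtain an ordered partition of $\{1,\dots,k\}$ into $h$ non-empty subsets. Yet the
number of such partitions, without order, is given by the Stirling numbers.

Definition 9.1 For $0 \leq h \leq k$, the Stirling number $\left\{ {k \atop h} \right\}$ is defined as the number of
partitions of a set of cardinality $k$ into $h$ non-empty subsets.

We have therefore

$$|\mathcal{S}(k, h)| = h!\, \left\{ {k \atop h} \right\}.$$

Putting together the previous equalities, we conclude that

$$n^k = \sum_{h=1}^{k} \binom{n}{h}\, h!\, \left\{ {k \atop h} \right\}. \tag{9.4}$$

Substituting this identity into formulas (9.3) and (9.2), we get

$$\begin{aligned}
Q(k) &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, q^k (1-q)^{n\ell - k} \binom{\ell}{k} \sum_{1 \leq h \leq k} \binom{n}{h}\, h!\, \left\{ {k \atop h} \right\} \\
&= \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, e^{-na}\, \frac{a^k}{k!} \sum_{1 \leq h \leq k} \binom{n}{h}\, h!\, \left\{ {k \atop h} \right\} \\
&= (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{1 \leq h \leq k} h!\, \left\{ {k \atop h} \right\} \sum_{n \geq h} \frac{1}{\sigma^n} \binom{n}{h}.
\end{aligned}$$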


The generating function of the binomial coefficients is computed in the next lemma.

Lemma 9.2 For any $x$ in $[0,1[$, we have

$$\forall h \geq 0 \qquad \sum_{n \geq h} \binom{n}{h}\, x^n = \frac{x^h}{(1-x)^{h+1}}.$$

Proof The case $h = 0$ corresponds to the standard geometric series:

$$\sum_{n \geq 0} x^n = \frac{1}{1-x}.$$

For $h \geq 1$, we differentiate this identity $h$ times, and we get

$$\sum_{n \geq h} n(n-1)\cdots(n-h+1)\, x^{n-h} = \frac{h!}{(1-x)^{h+1}}.$$

Multiplying by $x^h$ and dividing by $h!$, we obtain the desired identity. □
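The identity of lemma 9.2 can be checked numerically by truncating the series. This is a small illustrative sketch; the helper names and the truncation length are arbitrary.

```python
import math

def binomial_series(h, x, terms=500):
    """Truncated sum over n >= h of C(n, h) * x^n."""
    return sum(math.comb(n, h) * x**n for n in range(h, h + terms))

def closed_form(h, x):
    """Right-hand side of lemma 9.2: x^h / (1 - x)^(h + 1)."""
    return x**h / (1.0 - x)**(h + 1)

if __name__ == "__main__":
    for h in range(4):
        print(h, binomial_series(h, 0.25), closed_form(h, 0.25))
```

The two columns printed by the script agree up to floating-point rounding for any $x$ well inside $[0,1[$; close to 1 the truncation length would have to be increased.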


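Substituting the identity of lemma 9.2 into the previous expression for $Q(k)$, we can
perform the summation in $n$ and we get finally

$$\forall k \geq 1 \qquad Q(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h=1}^{k} h!\, \left\{ {k \atop h} \right\} \frac{\sigma}{(\sigma - 1)^{h+1}}. \tag{9.5}$$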
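The decomposition (9.4) and the finite formula (9.5) can both be tested on small cases. The sketch below computes the Stirling numbers of the second kind with their standard recurrence (a choice made here, not taken from the text), checks the expansion of $n^k$, and evaluates (9.5) so that it can be compared with a truncated version of the series of section 8.2. All function names are hypothetical.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(k, h):
    """Number of partitions of a k-set into h non-empty blocks."""
    if k == h:
        return 1
    if h == 0 or h > k:
        return 0
    return h * stirling2(k - 1, h) + stirling2(k - 1, h - 1)

def power_via_stirling(n, k):
    """Right-hand side of (9.4): sum over h of C(n, h) * h! * S(k, h)."""
    return sum(math.comb(n, h) * math.factorial(h) * stirling2(k, h)
               for h in range(1, k + 1))

def q_stirling(sigma, a, k):
    """Finite formula (9.5), valid for k >= 1 and sigma e^{-a} > 1."""
    rho = sigma * math.exp(-a)
    inner = sum(math.factorial(h) * stirling2(k, h) * sigma / (sigma - 1.0)**(h + 1)
                for h in range(1, k + 1))
    return (rho - 1.0) * a**k / math.factorial(k) * inner
```

For example, `power_via_stirling(5, 3)` can be compared with `5**3`, and `q_stirling(4.0, 0.5, k)` with the partial sums of $(\sigma e^{-a} - 1)\frac{a^k}{k!}\sum_n n^k/\sigma^n$.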


9.2 Eulerian Numbers 


In order to obtain yet another formula for the quasispecies distribution, we shall
compute the factor $n^k$ in formula (9.3) using a more elaborate method. This factor is
the number of matrices of size $n \times k$ containing exactly one one in each column. Let
$\mathcal{M}(n, k)$ be the collection of these matrices and let $M = (M_{i,j},\ 1 \leq i \leq n,\ 1 \leq j \leq k)$
be one such matrix. We screen the matrix $M$ row by row from the top to the bottom
and we record the sequence of the indices of the columns containing the ones. For
$1 \leq j \leq k$, let us denote by $\sigma_M(j)$ the index of the column where the $j$-th one is
encountered during the screening process. The resulting sequence $\sigma_M(1), \dots, \sigma_M(k)$
is a permutation of $1, \dots, k$. We recall that $(i, j)$ is less than or equal to $(i', j')$ for the
lexicographic order if and only if $i < i'$ or ($i = i'$ and $j \leq j'$). Let us denote by

$$(i_1, j_1), \dots, (i_k, j_k)$$

the positions of the $k$ ones in the matrix $M$, arranged according to the lexicographic
order. Then the sequence $\sigma_M(1), \dots, \sigma_M(k)$ coincides with the sequence $j_1, \dots, j_k$.
The formula displayed below shows a matrix in $\mathcal{M}(4, 7)$, along with the permutation
and the sequence of indices associated to it:

$$(3\ 2\ 4\ 5\ 1\ 6\ 7) \qquad \begin{pmatrix} 0&0&1&0&0&0&0 \\ 0&1&0&1&1&0&0 \\ 1&0&0&0&0&0&0 \\ 0&0&0&0&0&1&1 \end{pmatrix} \qquad \begin{array}{l} (1,3), \\ (2,2), (2,4), (2,5), \\ (3,1), \\ (4,6), (4,7) \end{array} \tag{9.6}$$

We have seen that each matrix $M \in \mathcal{M}(n, k)$ uniquely determines one permutation
$\sigma_M$ in $\mathfrak{S}_k$, the collection of the permutations of $k$ elements. Conversely, let us fix a
permutation $\sigma$ in $\mathfrak{S}_k$. We would like to compute the number of matrices $M$ whose
associated permutation is $\sigma$. These matrices are determined by the location of the
ones, therefore their number is the number of sequences $i_1, \dots, i_k$ in $\{1,\dots,n\}$ such
that $(i_1, \sigma(1)), \dots, (i_k, \sigma(k))$ is non-decreasing for the lexicographic order. We shall
rewrite this condition in terms of the ascents and the descents of the permutation $\sigma$.


Definition 9.3 We say that the permutation $\sigma$ has an ascent at position $h$ if $\sigma(h) < \sigma(h+1)$
and that it has a descent at position $h$ if $\sigma(h) > \sigma(h+1)$.

The condition that $(i_1, \sigma(1)), \dots, (i_k, \sigma(k))$ is non-decreasing for the lexicographic
order can be rewritten as:

• $1 \leq i_1 \leq \cdots \leq i_k \leq n$;

• if $\sigma$ has a descent at $h \in \{1,\dots,k-1\}$, then $i_h < i_{h+1}$. (9.7)


We perform a change of variables in order to simplify these conditions. For $h \in \{1,\dots,k\}$,
we define $\operatorname{asc}(\sigma, h)$ as the number of ascents for the $h$ first values, i.e.,

$$\operatorname{asc}(\sigma, h) = \big|\{\, i < h : \sigma(i) < \sigma(i+1) \,\}\big|,$$

and we set

$$i'_h = i_h + \operatorname{asc}(\sigma, h). \tag{9.8}$$


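In the example given in formula (9.6), we have $\sigma = (3\,2\,4\,5\,1\,6\,7)$; the ascents occur at
the positions $(2, 3, 5, 6)$, and therefore

$$\begin{array}{rcl}
\operatorname{asc}(\sigma, \cdot) &=& 0, 0, 1, 2, 2, 3, 4 \\
+\quad i_\cdot &=& 1, 2, 2, 2, 3, 4, 4 \\
\hline
i'_\cdot &=& 1, 2, 3, 4, 5, 7, 8
\end{array}$$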
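The worked example can be replayed mechanically. The short sketch below scans the matrix of (9.6) row by row, reads off the permutation and the row indices, and applies the change of variables (9.8). The function names are illustrative only.

```python
def scan_positions(matrix):
    """Positions (row, column) of the ones, 1-indexed, in lexicographic order."""
    return [(i + 1, j + 1)
            for i, row in enumerate(matrix)
            for j, entry in enumerate(row) if entry == 1]

def ascent_counts(perm):
    """asc(sigma, h) for h = 1..k: number of ascents at positions i < h."""
    counts, running = [], 0
    for h in range(len(perm)):
        counts.append(running)
        if h + 1 < len(perm) and perm[h] < perm[h + 1]:
            running += 1
    return counts

def shifted_rows(matrix):
    """Permutation, row indices i_h and shifted indices i'_h = i_h + asc(sigma, h)."""
    positions = scan_positions(matrix)
    perm = [j for _, j in positions]
    rows = [i for i, _ in positions]
    shifted = [i + asc for i, asc in zip(rows, ascent_counts(perm))]
    return perm, rows, shifted

EXAMPLE = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
]
```

Calling `shifted_rows(EXAMPLE)` reproduces the three rows of the table above, and the last list is strictly increasing, as the argument requires.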


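The point is that this new sequence $i'_1, \dots, i'_k$ is strictly increasing. Indeed, if $\sigma$ has
a descent at $h$, then $i_h < i_{h+1}$ and $\operatorname{asc}(\sigma, h) = \operatorname{asc}(\sigma, h+1)$, so that $i'_h < i'_{h+1}$.
If $\sigma$ has an ascent at $h$, then $\operatorname{asc}(\sigma, h) < \operatorname{asc}(\sigma, h+1)$ and

$$i'_{h+1} = i_{h+1} + \operatorname{asc}(\sigma, h+1) > i_h + \operatorname{asc}(\sigma, h) = i'_h.$$

After performing the change of variables, the two conditions (9.7) reduce to one
single condition for the sequence $i'_1, \dots, i'_k$, namely

$$1 \leq i'_1 < \cdots < i'_k \leq n + \operatorname{asc}(\sigma, k). \tag{9.9}$$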


The permutation $\sigma$ being fixed, the transformation given in formula (9.8) provides
a one-to-one correspondence between the sequences $i_1, \dots, i_k$ satisfying (9.7) and
the sequences $i'_1, \dots, i'_k$ satisfying (9.9). Now the number of sequences $i'_1, \dots, i'_k$
satisfying (9.9) is simply equal to the number of ways to choose $k$ elements among
$n + \operatorname{asc}(\sigma, k)$. We conclude that the number of matrices in the collection $\mathcal{M}(n, k)$ is
equal to

$$n^k = \sum_{\sigma \in \mathfrak{S}_k} \binom{n + \operatorname{asc}(\sigma, k)}{k} = \sum_{h=0}^{k-1}\ \sum_{\substack{\sigma \in \mathfrak{S}_k \\ \operatorname{asc}(\sigma, k) = h}} \binom{n + h}{k}. \tag{9.10}$$

We shall finally rewrite this formula with the help of the Eulerian numbers, which
are in fact equal to the number of terms in the inner sum.

Definition 9.4 For $0 \leq h \leq k$, the Eulerian number $\left\langle {k \atop h} \right\rangle$ is the number of permutations
of $\mathfrak{S}_k$ having exactly $h$ ascents.

From equation (9.10), we see that

$$n^k = \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \binom{n + h}{k}. \tag{9.11}$$


This is a classical identity, known as Worpitzky’s identity (see [74], Corollary 1.2
or [61], Proposition 5.84). This identity is attributed to Li Shan-Lan in 1867 by
Knuth (see [59], section 5.1.3); indeed, Worpitzky’s paper was published later, in the
West, in 1883. The idea of the proof above is presented in Knuth’s book [59]
and in the hint to problem 1.13 of [74].

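Substituting Worpitzky’s identity into formulas (9.3) and (9.2), we get

$$\begin{aligned}
Q(k) &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{n \geq 1} \frac{\sigma e^{-a} - 1}{(\sigma e^{-a})^n}\, q^k (1-q)^{n\ell - k} \binom{\ell}{k} \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \binom{n+h}{k} \\
&= (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{n \geq 1} \frac{1}{\sigma^n} \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \binom{n+h}{k} \\
&= (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \sum_{n \geq h+1} \frac{1}{\sigma^{n-h}} \binom{n}{k} \\
&= (\sigma e^{-a} - 1)\, \frac{a^k}{k!} \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \sigma^h \sum_{n \geq k} \frac{1}{\sigma^n} \binom{n}{k}.
\end{aligned}$$

Using the formula for the generating function of the binomial coefficients given in
lemma 9.2, we obtain

$$\forall k \geq 1 \qquad Q(k) = (\sigma e^{-a} - 1)\, \frac{a^k}{k!}\, \frac{1}{(\sigma - 1)^{k+1}} \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \sigma^{h+1}. \tag{9.12}$$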
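Worpitzky's identity (9.11) and the finite formula (9.12) can be checked on small values. The sketch below counts permutations by their number of ascents by brute force, directly from definition 9.4, which is only practical for small $k$; the function names are hypothetical.

```python
import math
from itertools import permutations

def eulerian_numbers(k):
    """List [<k,0>, ..., <k,k-1>]: permutations of {1..k} counted by ascents."""
    counts = [0] * k
    for p in permutations(range(1, k + 1)):
        ascents = sum(1 for i in range(k - 1) if p[i] < p[i + 1])
        counts[ascents] += 1
    return counts

def worpitzky(n, k):
    """Right-hand side of (9.11): sum over h of <k,h> * C(n + h, k)."""
    return sum(e * math.comb(n + h, k) for h, e in enumerate(eulerian_numbers(k)))

def q_eulerian(sigma, a, k):
    """Finite formula (9.12), valid for k >= 1 and sigma e^{-a} > 1."""
    rho = sigma * math.exp(-a)
    inner = sum(e * sigma**(h + 1) for h, e in enumerate(eulerian_numbers(k)))
    return (rho - 1.0) * a**k / math.factorial(k) * inner / (sigma - 1.0)**(k + 1)
```

For instance, `worpitzky(5, 3)` can be compared with `5**3`, and `q_eulerian(4.0, 0.5, k)` with the value given by the Stirling formula (9.5) or by the truncated series.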


9.3 Combinatorial Identities 


As a byproduct of the previous proofs, we obtain several combinatorial identities.
The equality between the two finite formulas (9.5) and (9.12) for the quasispecies
distribution yields the following identity:

$$\sum_{h=1}^{k} \left\{ {k \atop h} \right\} h!\, (\sigma - 1)^{k-h} = \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \sigma^h. \tag{9.13}$$

For $\sigma = 1$, we see that the sum of the Eulerian numbers is equal to $k!$, which was
to be expected! Identifying the coefficients of $\sigma^n$ in (9.13), we recover one of the
classical formulas linking the Stirling and the Eulerian numbers:

$$\left\langle {k \atop n} \right\rangle = \sum_{h=1}^{k-n} \left\{ {k \atop h} \right\} h!\, \binom{k-h}{n} (-1)^{k-h-n}.$$

Similarly, if we develop the right-hand side in powers of $\sigma - 1$ and we identify the
coefficients of $(\sigma - 1)^n$, we get

$$(k-n)!\, \left\{ {k \atop k-n} \right\} = \sum_{h=n}^{k-1} \left\langle {k \atop h} \right\rangle \binom{h}{n}.$$

This formula is classical, see for instance [61], Proposition 5.83. Equating the two
expressions for $n^k$ of formulas (9.4) and (9.11), we get a more symmetric formula:

$$n^k = \sum_{h=1}^{k} \left\{ {k \atop h} \right\} h!\, \binom{n}{h} = \sum_{h=0}^{k-1} \left\langle {k \atop h} \right\rangle \binom{n+h}{k}.$$
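These identities between Stirling and Eulerian numbers are exact integer identities, so they can be verified without any rounding on small cases. The sketch below does so with a recurrence for the Stirling numbers and a brute-force count for the Eulerian numbers; both choices, and the function names, are made for illustration.

```python
import math
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def stirling2(k, h):
    """Partitions of a k-set into h non-empty blocks (standard recurrence)."""
    if k == h:
        return 1
    if h == 0 or h > k:
        return 0
    return h * stirling2(k - 1, h) + stirling2(k - 1, h - 1)

def eulerian_numbers(k):
    """Permutations of {1..k} counted by their number of ascents."""
    counts = [0] * k
    for p in permutations(range(1, k + 1)):
        counts[sum(1 for i in range(k - 1) if p[i] < p[i + 1])] += 1
    return counts

def lhs_913(k, s):
    """Left-hand side of (9.13) at the integer point sigma = s."""
    return sum(stirling2(k, h) * math.factorial(h) * (s - 1)**(k - h)
               for h in range(1, k + 1))

def rhs_913(k, s):
    """Right-hand side of (9.13) at the integer point sigma = s."""
    return sum(e * s**h for h, e in enumerate(eulerian_numbers(k)))

def eulerian_from_stirling(k, n):
    """Coefficient of sigma^n in the left-hand side of (9.13)."""
    return sum(stirling2(k, h) * math.factorial(h) * math.comb(k - h, n) * (-1)**(k - h - n)
               for h in range(1, k - n + 1))
```

Evaluating both sides of (9.13) at more than $k$ integer points is enough to confirm the polynomial identity of degree $k - 1$ for a given $k$.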


Chapter 10
Error Threshold in Infinite Populations


In the previous chapters, we considered the long chain regime

$$\ell \to +\infty, \qquad q \to 0, \qquad \ell q \to a \in\, ]0, +\infty[\,.$$

Passing to the limit on the quasispecies equation (2.1), we obtained a limiting
system of equations (10.2), which we were able to solve. In part I, we saw also that
the quasispecies equation describes the equilibrium of several infinite population
models and finite population models when the size of the population is sent to $\infty$. It
is therefore natural to ask how the limiting system (10.2) and the population models
introduced in chapters 3 and 4 relate to each other. In this chapter we explore this
relationship for the infinite population models. The case of the finite population
models, which is much richer, since it depends on the interplay between the infinite
population limit and the long chain regime, will be the topic of parts III and IV.


10.1 The Moran—Kingman Model 


We consider the Moran–Kingman model introduced in section 3.1, where the genotype
space $E$ is taken to be the set of the Hamming classes $\{0,\dots,\ell\}$, along with
the sharp peak fitness function $A_H$ and the lumped mutation matrix $M_H$. We define
$S^\infty$ as

$$S^\infty = \Big\{\, c \in [0,1]^{\mathbb{N}} : \sum_{i \in \mathbb{N}} c(i) \leq 1 \,\Big\}.$$

Passing to the limit on the mapping $\Phi$ defined by (3.2), we obtain a mapping $\Phi_\infty$
from $S^\infty$ to $S^\infty$, which is given by

$$\forall c \in S^\infty \quad \forall k \geq 0 \qquad \Phi_\infty(c)(k) = \frac{\displaystyle\sum_{0 \leq i \leq k} c(i)\, A_H(i)\, e^{-a}\, \frac{a^{k-i}}{(k-i)!}}{(\sigma - 1)\, c(0) + 1}.$$

We can therefore define the limit Moran–Kingman model as the dynamical system
on $S^\infty$ associated to the iteration of the map $\Phi_\infty$:

$$\forall n \geq 0 \qquad c_{n+1} = \Phi_\infty(c_n). \tag{10.1}$$


The equations for the fixed points of the limiting dynamical system coincide with
the limit of the fundamental system of equations (7.1). The asymptotic behavior in
the long chain regime of the sequence of the iterates $(c_n)_{n \geq 0}$ is given by the next
theorem. Recall that the quasispecies distribution $Q(\sigma, a)$ is given in definition 7.1.

Theorem 10.1 We have the following dichotomy.

• If $\sigma e^{-a} \leq 1$, the dynamical system (10.1) has a unique fixed point, 0, and for any
initial condition $c_0 \in S^\infty$,

$$\forall k \geq 0 \qquad \lim_{n \to \infty} c_n(k) = 0.$$

• If $\sigma e^{-a} > 1$, the dynamical system (10.1) has two fixed points, 0 and $Q(\sigma, a)$.
Moreover, for any initial condition $c_0 \in S^\infty$, we have

$$\begin{aligned}
c_0(0) > 0 \quad &\Longrightarrow \quad \forall k \geq 0 \quad \lim_{n \to \infty} c_n(k) = Q(\sigma, a)(k), \\
c_0(0) = 0 \quad &\Longrightarrow \quad \forall k \geq 0 \quad \lim_{n \to \infty} c_n(k) = 0.
\end{aligned}$$


Proof The dynamical system (10.1) involves an infinite number of equations. Nevertheless,
we remark that the dynamics of $(c_n(k))_{n \geq 0}$ depends only on the variables
$c_n(0), \dots, c_n(k)$. Let us consider first the case $k = 0$. We have then

$$c_{n+1}(0) = \Phi_0^\infty(c_n(0)),$$

where the function $\Phi_0^\infty : [0,1] \to [0,1]$ is defined by

$$\forall x \in [0,1] \qquad \Phi_0^\infty(x) = \frac{\sigma e^{-a} x}{(\sigma - 1) x + 1}.$$

We have already studied the fixed points of this function. Indeed, the fixed point
equation $\Phi_0^\infty(x) = x$ is equation (7.1) for $k = 0$. Thus, we know that if $\sigma e^{-a} \leq 1$,
its only fixed point is 0, while if $\sigma e^{-a} > 1$, its only fixed points are 0 and
$(\sigma e^{-a} - 1)/(\sigma - 1)$. The behavior of the iterates of $\Phi_0^\infty$ is described schematically
in figures 10.1 and 10.2.

We easily see from the graphs that if $\sigma e^{-a} \leq 1$ then $(\Phi_0^\infty)^n(x) \to 0$, and if
$\sigma e^{-a} > 1$ and $x > 0$ then $(\Phi_0^\infty)^n(x) \to Q(0)$. In order to prove this rigorously, we
compute the derivative of the function $\Phi_0^\infty$:

$$(\Phi_0^\infty)'(x) = \frac{\sigma e^{-a}\big((\sigma - 1)x + 1\big) - \sigma e^{-a}(\sigma - 1)x}{\big((\sigma - 1)x + 1\big)^2} = \frac{\sigma e^{-a}}{\big((\sigma - 1)x + 1\big)^2}.$$

Fig. 10.1 Convergence to 0 in the subcritical case

Fig. 10.2 Convergence to Q(0) in the supercritical case

The derivative $(\Phi_0^\infty)'(x)$ is positive for all $x \in [0,1]$, and the function $\Phi_0^\infty(x)$ is thus
increasing. Moreover, $(\Phi_0^\infty)'(0) = \sigma e^{-a}$ and $\Phi_0^\infty(1) = e^{-a} < 1$. Assume first that
$\sigma e^{-a} \leq 1$. Since $(\Phi_0^\infty)'(0) \leq 1$ and $\Phi_0^\infty(1) < 1$, noting that the function $\Phi_0^\infty(x)$ does
not intersect the line $y = x$ for any $x \in\, ]0,1]$, we conclude that $\Phi_0^\infty(x) < x$ for every
$x \in\, ]0,1]$. Thus, the sequence $((\Phi_0^\infty)^n(x))_{n \geq 0}$ is decreasing and bounded from below
by 0. In addition, its limit must satisfy the fixed point equation for $\Phi_0^\infty$. We conclude
therefore that

$$\forall x \in [0,1] \qquad \lim_{n \to \infty} (\Phi_0^\infty)^n(x) = 0.$$

Assume next that $\sigma e^{-a} > 1$, and note that

$$Q(0) = \frac{\sigma e^{-a} - 1}{\sigma - 1}.$$

Since $(\Phi_0^\infty)'(0) > 1$, $\Phi_0^\infty(1) < 1$ and $\Phi_0^\infty(Q(0)) = Q(0)$, we have $\Phi_0^\infty(x) > x$ for
$x \in\, ]0, Q(0)[$ and $\Phi_0^\infty(x) < x$ for $x \in\, ]Q(0), 1]$. Thus, for any $x \in\, ]0, Q(0)[$, the
sequence $((\Phi_0^\infty)^n(x))_{n \geq 0}$ is increasing and bounded from above by $Q(0)$; moreover,
its limit must satisfy the fixed point equation for $\Phi_0^\infty$. We conclude therefore that

$$\forall x \in\, ]0, Q(0)[ \qquad \lim_{n \to \infty} (\Phi_0^\infty)^n(x) = Q(0).$$


Similar arguments show that, for $x \in\, ]Q(0), 1]$, the sequence $((\Phi_0^\infty)^n(x))_{n \geq 0}$ also
converges to $Q(0)$. It remains to study the convergence of the other coordinates. Let
us assume that $\sigma e^{-a} > 1$ and that $c_0(0) > 0$ (the remaining cases can be analyzed in
a similar manner). We do the proof by induction on $k$. We already know the result
to be true for $k = 0$. Fix $k \geq 1$ and assume that the result holds for the coordinates
$0, \dots, k-1$. By the induction hypothesis

$$\lim_{n \to \infty} \big(c_n(0), \dots, c_n(k-1)\big) = \big(Q(0), \dots, Q(k-1)\big).$$

The sequence $(c_n(k))_{n \geq 0}$ takes its values in the compact set $[0,1]$. Thus, up to the
extraction of a subsequence, we can suppose that it converges. Let us denote its limit
by $c_\infty(k)$. The value $c_\infty(k)$ must then satisfy the equation

$$c_\infty(k) = \frac{\displaystyle\sum_{0 \leq i \leq k-1} Q(i)\, A_H(i)\, e^{-a}\, \frac{a^{k-i}}{(k-i)!} + c_\infty(k)\, e^{-a}}{(\sigma - 1)\, Q(0) + 1}.$$

We conclude that $c_\infty(k)$ is equal to $Q(k)$ and that the sequence $(c_n(k))_{n \geq 0}$ converges
towards $Q(k)$ as $n$ goes to $\infty$. □
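The dichotomy of theorem 10.1 can be observed by iterating the limit map on the first few coordinates. Since the coordinate $k$ of the image depends only on the coordinates $0, \dots, k$, restricting the vector to its first coordinates is exact, not an approximation. The sketch below is an illustration with hypothetical function names and arbitrary initial conditions.

```python
import math

def limit_map(c, sigma, a):
    """One step of the limit Moran-Kingman map on the first len(c) coordinates."""
    fitness = [sigma] + [1.0] * (len(c) - 1)   # sharp peak landscape A_H
    denom = (sigma - 1.0) * c[0] + 1.0         # mean fitness
    image = []
    for k in range(len(c)):
        numer = sum(c[i] * fitness[i] * math.exp(-a) * a**(k - i) / math.factorial(k - i)
                    for i in range(k + 1))
        image.append(numer / denom)
    return image

def iterate_limit_map(c0, sigma, a, steps):
    """Apply the limit map `steps` times starting from c0."""
    c = list(c0)
    for _ in range(steps):
        c = limit_map(c, sigma, a)
    return c

if __name__ == "__main__":
    print(iterate_limit_map([0.2, 0.0, 0.0, 0.0], 4.0, 0.5, 200))  # supercritical
    print(iterate_limit_map([0.5, 0.3, 0.1, 0.0], 1.5, 1.0, 400))  # subcritical
```

In the supercritical run the first coordinate settles at $(\sigma e^{-a} - 1)/(\sigma - 1)$, while in the subcritical run every coordinate decays to 0.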


10.2 The Eigen Model 


We consider the Eigen model introduced in section 4.1, where the genotype space $E$
is taken to be the set of the Hamming classes $\{0,\dots,\ell\}$, along with the sharp peak
fitness function $A_H$ and the lumped mutation matrix $M_H$. Passing to the limit on the
differential equation (4.2), we obtain the infinite system of differential equations

$$\forall k \geq 0 \qquad \frac{d c_t(k)}{dt} = \sum_{0 \leq i \leq k} c_t(i)\, A_H(i)\, e^{-a}\, \frac{a^{k-i}}{(k-i)!} - c_t(k)\big((\sigma - 1)\, c_t(0) + 1\big). \tag{10.2}$$

We can thus define a limit Eigen model by this system of differential equations. Once
again, the equations describing the equilibrium solutions of the limit Eigen model
coincide with the limit of the fundamental system of equations (7.1). The asymptotic
behavior of the solution of the differential system (10.2) as $t$ goes to $\infty$ is described
in the next theorem.
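Theorem 10.2 We have the following dichotomy.

• If $\sigma e^{-a} \leq 1$, the system of differential equations (10.2) has a unique equilibrium
solution, 0, and for any initial condition $c_0 \in S^\infty$,

$$\forall k \geq 0 \qquad \lim_{t \to \infty} c_t(k) = 0.$$

• If $\sigma e^{-a} > 1$, the system of differential equations (10.2) has two equilibrium
solutions, 0 and $Q(\sigma, a)$. Moreover, for any initial condition $c_0 \in S^\infty$, we have

$$\begin{aligned}
c_0(0) > 0 \quad &\Longrightarrow \quad \forall k \geq 0 \quad \lim_{t \to \infty} c_t(k) = Q(\sigma, a)(k), \\
c_0(0) = 0 \quad &\Longrightarrow \quad \forall k \geq 0 \quad \lim_{t \to \infty} c_t(k) = 0.
\end{aligned}$$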
Proof The system of differential equations (10.2) has an infinite number of equations.
Nevertheless, we remark that the dynamics of $(c_t(k))_{t\ge 0}$ depends only on the
variables $(c_t(0),\dots,c_t(k))_{t\ge 0}$. Let us consider first the case $k = 0$. We have then

\[ \frac{dc_t(0)}{dt} = c_t(0)\,\sigma e^{-a} - c_t(0)\big((\sigma-1)c_t(0)+1\big). \]


We have already studied the equilibrium solutions of this differential equation: if
$\sigma e^{-a} \le 1$, its only equilibrium solution is 0, while if $\sigma e^{-a} > 1$, its equilibrium
solutions are 0 and $Q(0)$. Let $F_0$ be the mapping from $[0,1]$ to $\mathbb{R}$ defined by

\[ \forall x \in [0,1] \qquad F_0(x) = x\big(\sigma e^{-a} - 1 - (\sigma-1)x\big). \]

The behavior of the dynamical system $x' = F_0(x)$ is represented schematically in
figures 10.3 and 10.4. It is easy to see from the graphs that if $\sigma e^{-a} \le 1$ then
$x(t) \to 0$, while if $\sigma e^{-a} > 1$ and $x(0) > 0$ then $x(t) \to Q(0)$. In order to prove this
rigorously, we suppose first that $\sigma e^{-a} \le 1$. In this case, the mapping $F_0$ is negative
on $]0,1]$. Thus, for any $c_0(0) \in\, ]0,1]$, the map $s \in [0,+\infty[\ \mapsto c_s(0)$ is decreasing
and bounded from below by 0. Therefore it has a limit $c^* \in [0,1]$ and

\[ \lim_{s\to\infty} F_0(c_s(0)) = F_0(c^*). \]


Fig. 10.3 Convergence to 0 in the subcritical case

Fig. 10.4 Convergence to Q(0) in the supercritical case

Moreover, sending $t$ to $\infty$ in the equality

\[ c_t(0) = c_0(0) + \int_0^t F_0(c_s(0))\,ds, \]

we obtain that

\[ c^* = c_0(0) + \int_0^\infty F_0(c_s(0))\,ds. \]


Yet the integrand is strictly negative. The convergence of the integral implies that
$F_0(c^*) = 0$, which in turn implies that $c^* = 0$. Suppose next that $\sigma e^{-a} > 1$. In
this case, the mapping $F_0$ is positive on $]0,Q(0)[$ and negative on $]Q(0),1]$. An
argument similar to the previous one shows that, for any $c_0(0) \in\, ]0,1]$, the trajectory
$c_t(0)$ converges to $Q(0)$ when $t$ goes to infinity.

It remains to study the convergence of the other coordinates. We assume that
$\sigma e^{-a} > 1$ and $c_0(0) > 0$. The other cases are simpler to deal with. We do the proof
by induction on $k$. The case $k = 0$ has already been settled. Fix $k \ge 1$ and assume the
result to be true for the coordinates $0,\dots,k-1$. We introduce the following change
of variables:

\[ \forall i \ge 0 \qquad y_t(i) = \frac{c_t(i)}{c_t(0)}. \]

Differentiating both sides of this equality, we get a system of differential equations
for the variables $y_t(i)$, $i \ge 0$, which reads

\[ \frac{dy_t(i)}{dt} = \sigma e^{-a}\frac{a^i}{i!} + \sum_{1\le h\le i-1} y_t(h)\,e^{-a}\frac{a^{i-h}}{(i-h)!} - y_t(i)\,e^{-a}(\sigma-1), \qquad i \ge 1. \]


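Moreover we have $y_t(0) = 1$ for all $t \ge 0$. Let us set $\alpha = e^{-a}(\sigma-1)$ and define

\[ \forall i \ge 1 \quad \forall y(1),\dots,y(i-1) \ge 0 \qquad g_i\big(y(1),\dots,y(i-1)\big) = \sigma e^{-a}\frac{a^i}{i!} + \sum_{1\le h\le i-1} y(h)\,e^{-a}\frac{a^{i-h}}{(i-h)!}. \]

With this notation, the system of differential equations can be rewritten as

\[ \frac{dy_t(i)}{dt} = g_i\big(y_t(1),\dots,y_t(i-1)\big) - \alpha\,y_t(i), \qquad i \ge 1. \]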


This system is upper triangular. Once the trajectories

\[ \big(y_t(1),\dots,y_t(k-1)\big)_{t\ge 0} \]

are given, the differential equation for $i = k$ is a linear differential equation of
the first order, which we can explicitly solve, obtaining

\[ y_t(k) = e^{-\alpha t}\,y_0(k) + \int_0^t g_k\big(y_s(1),\dots,y_s(k-1)\big)\,e^{-\alpha(t-s)}\,ds. \]


The first term on the right-hand side of this expression goes to 0 when $t$ goes to
infinity. For the second one, note that

\[ g_k\big(y_t(1),\dots,y_t(k-1)\big) = g_k\Big(\frac{c_t(1)}{c_t(0)},\dots,\frac{c_t(k-1)}{c_t(0)}\Big), \]

which converges towards

\[ g_k^* = g_k\Big(\frac{Q(1)}{Q(0)},\dots,\frac{Q(k-1)}{Q(0)}\Big) \]

when $t$ goes to $\infty$, thanks to the induction hypothesis. Let $\varepsilon > 0$. The mapping $g_k$
being continuous, we can choose $T$ large enough so that, for all $t \ge T$,

\[ \Big| g_k\big(y_t(1),\dots,y_t(k-1)\big) - g_k^* \Big| < \alpha\varepsilon. \]


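Thus, for $t \ge T$,

\begin{align*}
\Big| \int_0^t g_k\big(y_s(1),\dots,y_s(k-1)\big)\,e^{-\alpha(t-s)}\,ds - \frac{g_k^*}{\alpha} \Big|
&\le e^{-\alpha t}\,\frac{g_k^*}{\alpha} + \int_0^t \Big| g_k\big(y_s(1),\dots,y_s(k-1)\big) - g_k^* \Big|\,e^{-\alpha(t-s)}\,ds \\
&\le e^{-\alpha t}\Big( \frac{g_k^*}{\alpha} + \int_0^T \Big| g_k\big(y_s(1),\dots,y_s(k-1)\big) - g_k^* \Big|\,e^{\alpha s}\,ds \Big) + \varepsilon\big(1 - e^{-\alpha(t-T)}\big).
\end{align*}

Letting $t$ go to $\infty$, we see that this last quantity is bounded by $2\varepsilon$ for $t$ large enough. Sending $\varepsilon$ to 0, we
obtain

\[ \lim_{t\to\infty} y_t(k) = \frac{g_k^*}{\alpha}. \]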


The quasispecies distribution $Q$ is the normalized solution of the system (7.1).
Rewriting this system with the help of the function $g_k$, we obtain the relation

\[ Q(k) = \frac{g_k^*}{\alpha}\,Q(0). \]

We conclude that

\[ \lim_{t\to\infty} c_t(k) = \lim_{t\to\infty} c_t(0)\,y_t(k) = Q(0)\,\frac{g_k^*}{\alpha} = Q(k), \]

as wanted. □
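
The convergence stated in theorem 10.2 can be observed numerically. The sketch below is an illustration and not part of the proof: it integrates the first few coordinates of the limit system (10.2) with a fixed-step Runge-Kutta scheme, which is legitimate because the equation for coordinate $k$ involves only the coordinates $0,\dots,k$. The function names `rhs` and `integrate`, the step size and the time horizon are assumptions made for this example.

```python
import math


def rhs(c, sigma, a):
    """Right-hand side of system (10.2) for the coordinates 0..len(c)-1."""
    mean_fitness = (sigma - 1.0) * c[0] + 1.0
    out = []
    for k in range(len(c)):
        gain = 0.0
        for i in range(k + 1):
            fitness = sigma if i == 0 else 1.0  # sharp peak fitness A_H
            gain += c[i] * fitness * math.exp(-a) * a ** (k - i) / math.factorial(k - i)
        out.append(gain - c[k] * mean_fitness)
    return out


def integrate(c0, sigma, a, t_end=60.0, dt=0.01):
    """Classical fourth-order Runge-Kutta scheme with a fixed step (illustrative choice)."""
    c = list(c0)
    for _ in range(round(t_end / dt)):
        k1 = rhs(c, sigma, a)
        k2 = rhs([x + 0.5 * dt * d for x, d in zip(c, k1)], sigma, a)
        k3 = rhs([x + 0.5 * dt * d for x, d in zip(c, k2)], sigma, a)
        k4 = rhs([x + dt * d for x, d in zip(c, k3)], sigma, a)
        c = [x + dt * (d1 + 2.0 * d2 + 2.0 * d3 + d4) / 6.0
             for x, d1, d2, d3, d4 in zip(c, k1, k2, k3, k4)]
    return c


if __name__ == "__main__":
    sigma, a = 4.0, 0.5  # supercritical case: sigma * exp(-a) > 1
    limit = integrate([0.1, 0.2, 0.3, 0.1], sigma, a)
    q0 = (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)
    print("c_t(0) at t = 60:", limit[0], "  Q(0):", q0)
```

In the supercritical case the first coordinate should approach $Q(0) = (\sigma e^{-a}-1)/(\sigma-1)$, and with $a = 2$ (subcritical for $\sigma = 4$) all coordinates should decay to 0.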


Part III 
Error Threshold in Finite Populations 


Overview of Part III 


This part studies the impact of a finite population on the error threshold and the qua- 
sispecies. The central probabilistic results are presented in chapter 11. These results 
are the counterparts for finite population models of the error threshold phenomenon 
discovered by Eigen. We formulate them for two classical stochastic population 
models, namely the Moran model and the Wright—Fisher model. They take the form 
of a phase transition. The novel feature compared with the classical error threshold 
is that, instead of a critical point, we obtain a curve separating the quasispecies 
regime from the disordered regime. The crucial parameters in the limit are the mean 
number of mutations per chromosome per reproduction cycle and the limit of the 
ratio between the population size and the genome length. The critical curve of the 
new error threshold should be interpreted as a generalization of Eigen’s result: if a 
quasispecies is to form, not only should the mutation probability be small, but the 
population should be large too. In fact, when the ratio between the population size
and the genome length goes to $\infty$, Eigen's original threshold is retrieved. When a
quasispecies is formed, the distribution of this quasispecies is still $Q(\sigma, a)$, regardless
of the population size. We present empirical simulations illustrating these results
in chapter 12. The proofs of these theorems are very long. In chapter 13, we give 
some heuristics justifying the result in the case of the Moran model. These heuristics 
could be developed into a rigorous proof, but we refrain from doing so, because this 
approach would be specific to the Moran model. Instead, we present the framework 
for a more robust approach in chapter 15. The full proof for the Wright—Fisher model 
is given in part IV. 


Chapter 11
Phase Transition


Our goal in this chapter is to prove convergence results in the long chain regime, 
which are the counterparts of the convergence results proved in chapter 3. More 
precisely, we study simultaneously the limits

\[ \ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0, \qquad \ell q \to a, \qquad \frac{m}{\ell} \to \alpha. \]


The relevant models are the classical finite population models, namely the Moran 
model of section 4.3 and the Wright—Fisher model of section 3.3. The main results 
are very similar for both models, yet none is a consequence of the other, and they 
have to be proven separately. On the technical side, the central steps involved in the 
analysis of the Moran model rely on continuous objects, while for the Wright—Fisher 
model they rely on discrete objects. Therefore we handle the two models separately 
and we do not try to unify their treatment. Throughout this chapter, we denote by
$(Q(k))_{k\ge 0}$ the quasispecies distribution $Q(\sigma, a)$ (see definition 7.1) given by

\[ \forall k \ge 0 \qquad Q(k) = (\sigma e^{-a} - 1)\,\frac{a^k}{k!}\sum_{i\ge 1}\frac{i^k}{\sigma^i}. \]
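
As a quick numerical illustration, the quasispecies distribution can be evaluated by truncating the series over $i$. The sketch below assumes the formula $Q(\sigma,a)(k) = (\sigma e^{-a}-1)\,\frac{a^k}{k!}\sum_{i\ge 1} i^k/\sigma^i$ in the regime $\sigma e^{-a} > 1$; the function name `quasispecies` and the truncation level `terms` are choices made for this example. Close to the critical point $\sigma e^{-a} = 1$ the distribution spreads over many classes, so more classes are needed before the sum gets close to 1.

```python
import math


def quasispecies(sigma, a, k, terms=400):
    """Q(sigma, a)(k), with the series over i truncated after `terms` terms."""
    if sigma * math.exp(-a) <= 1.0:
        raise ValueError("this sketch only covers the case sigma * exp(-a) > 1")
    log_sigma = math.log(sigma)
    # i^k / sigma^i is evaluated as exp(k ln i - i ln sigma) to avoid overflow.
    series = sum(math.exp(k * math.log(i) - i * log_sigma) for i in range(1, terms + 1))
    return (sigma * math.exp(-a) - 1.0) * a ** k / math.factorial(k) * series


if __name__ == "__main__":
    sigma, a = 4.0, 0.5
    q = [quasispecies(sigma, a, k) for k in range(61)]
    print("first classes:", q[:4])
    print("sum over the classes 0..60:", sum(q))
```

The value for $k = 0$ reduces to the geometric series $(\sigma e^{-a}-1)/(\sigma-1)$, which gives a simple consistency check.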


11.1 The Moran Model 


In section 4.3, we considered the Moran model $(X_t)_{t\in\mathbb{R}^+}$ and the associated
concentration process $(C_t)_{t\in\mathbb{R}^+}$ for a general set of genotypes $E$, a fitness function $A$
and a mutation matrix $M$. We consider now the more specific framework introduced
in section 6.3. We take $E$ to be the set of the Hamming classes $\{0,\dots,\ell\}$, along
with the sharp peak fitness function $A_H$ and the lumped mutation matrix $M_H$. The
concentration process $(C_t)_{t\in\mathbb{R}^+}$ in this framework is a Markov process with state
space

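\[ S^{\ell+1} = \Big\{ c \in [0,1]^{\ell+1} : \sum_{0\le i\le\ell} c(i) = 1 \Big\} \cap \frac{1}{m}\mathbb{N}^{\ell+1} \tag{11.1} \]

and infinitesimal generator $L_{m,\ell,q}$ given by: for $f$ a function from $S^{\ell+1}$ to $\mathbb{R}$ and for
any $c \in S^{\ell+1}$,

\[ L_{m,\ell,q} f(c) = \sum_{0\le i,j,k\le\ell} c(i)\,A_H(i)\,M_H(i,j)\,c(k)\,\Big( f\Big(c + \frac{e(j)-e(k)}{m}\Big) - f(c) \Big), \]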


where $e(j) \in \mathbb{N}^{\ell+1}$ is the vector whose coordinates are equal to 0, except the $j$-th
coordinate, which is equal to 1. Let $\phi : \mathbb{R}^+ \to \mathbb{R}^+ \cup \{+\infty\}$ be the map given by

\[ \forall a < \ln\sigma \qquad \phi(a) = \frac{\sigma(1-e^{-a})\,\ln\dfrac{\sigma(1-e^{-a})}{\sigma-1} + \ln(\sigma e^{-a})}{1 - \sigma(1-e^{-a})} \]

and $\phi(a) = 0$ for $a \ge \ln\sigma$. Here is the main result concerning the Moran model. The
symbols E and Var denote the expectation and the variance.


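Theorem 11.1 Suppose that

\[ \ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0, \]
\[ \ell q \to a \in\, ]0,+\infty[, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty]. \]

We have the following dichotomy:
• If $\alpha\,\phi(a) < \ln\kappa$, then

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{t\to\infty} E(C_t(k)) = 0. \]

• If $\alpha\,\phi(a) > \ln\kappa$, then

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{t\to\infty} E(C_t(k)) = Q(\sigma,a)(k). \]

Furthermore, in both cases

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{t\to\infty} \mathrm{Var}(C_t(k)) = 0. \]
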


This theorem can be interpreted as a phase transition result. Indeed, if $\alpha\,\phi(a) < \ln\kappa$,
then the invariant probability measure of the Moran process converges weakly
towards the Dirac mass on 0, while for $\alpha\,\phi(a) > \ln\kappa$, it converges weakly towards
the Dirac mass on the quasispecies distribution. The fact that $\phi(a) = 0$ for $a \ge \ln\sigma$
is coherent with the error threshold computations for the Eigen model. Indeed, when
$a \ge \ln\sigma$, the Moran model is in the disordered regime even if the population
size $m$ grows much faster than the genome length $\ell$. The novel feature compared
with the classical error threshold is that, instead of a critical point, we obtain a
curve separating the quasispecies regime from the disordered regime. The crucial
parameters in the limit are the mean number of mutations per chromosome per
reproduction cycle $a$ and the limit of the ratio between the population size and the
genome length $\alpha$. We have an explicit expression for the critical curve because, in the
asymptotic regime, the dynamics of the master sequence in the Moran model is well 
approximated by a birth and death process. For birth and death processes, explicit 
formulas are available for hitting times and the invariant probability measure, and 
these formulas are amenable to an asymptotic analysis in the long chain regime. The 
full proof of theorem 11.1 relies on the construction of birth and death processes 
which approximate the dynamics of the concentration of each Hamming class. It can 
be found in [15, 17]. Surprisingly, the formula for the quasispecies distribution was 
initially found within this scheme, when trying to prove theorem 11.1. Only later did 
we realize that it was naturally the solution of the limit quasispecies equation [21]. 
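
The criterion of theorem 11.1 is easy to evaluate numerically. The sketch below is only an illustration: it computes $\phi(a)$ from the closed formula, cross-checks it against a midpoint quadrature of the integral $\int_0^{\rho^*(a)} \ln f_a(s)\,ds$ (the representation of $\phi$ obtained in section 13.3, with $f_a(s) = \sigma(1-s)e^{-a}/(\sigma s(1-e^{-a})+1-s)$ and $\rho^*(a) = (\sigma e^{-a}-1)/(\sigma-1)$), and compares $\alpha\,\phi(a)$ with $\ln\kappa$. The function names and the number of quadrature points are assumptions of this example. The closed formula has a removable singularity at $\sigma(1-e^{-a}) = 1$, which the sketch does not handle.

```python
import math


def rho_star(sigma, a):
    """Positive root rho*(a) when sigma * exp(-a) > 1, and 0 otherwise."""
    return max(0.0, (sigma * math.exp(-a) - 1.0) / (sigma - 1.0))


def phi_closed_form(sigma, a):
    """Closed formula for phi(a); requires sigma * (1 - exp(-a)) != 1."""
    if a >= math.log(sigma):
        return 0.0
    b = sigma * (1.0 - math.exp(-a))
    return (b * math.log(b / (sigma - 1.0)) + math.log(sigma * math.exp(-a))) / (1.0 - b)


def phi_quadrature(sigma, a, n=20000):
    """Midpoint rule for the integral of ln f_a over [0, rho*(a)]."""
    top = rho_star(sigma, a)
    if top == 0.0:
        return 0.0
    h = top / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        f = sigma * (1.0 - s) * math.exp(-a) / (sigma * s * (1.0 - math.exp(-a)) + 1.0 - s)
        total += math.log(f)
    return total * h


def regime(sigma, a, alpha, kappa):
    """Compare alpha * phi(a) with ln(kappa), as in the dichotomy of theorem 11.1."""
    value = alpha * phi_closed_form(sigma, a)
    threshold = math.log(kappa)
    if value > threshold:
        return "quasispecies"
    if value < threshold:
        return "disordered"
    return "critical (not covered by the theorem)"


if __name__ == "__main__":
    print(phi_closed_form(4.0, 0.5), phi_quadrature(4.0, 0.5))
    print(regime(4.0, 0.5, 10.0, 2), regime(4.0, 0.5, 1.0, 2))
```

For fixed $a < \ln\sigma$, the critical value of the ratio is $\alpha = \ln\kappa/\phi(a)$, so plotting this quantity against $a$ traces the critical curve.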


11.2 The Wright-Fisher Model 


In section 3.3, we considered the Wright-Fisher model $(X_n)_{n\in\mathbb{N}}$ and the associated
concentration process $(C_n)_{n\in\mathbb{N}}$ for a general set of genotypes $E$, a fitness function $A$
and a mutation matrix $M$. We consider now the more specific framework introduced
in section 6.3. We take $E$ to be the set of the Hamming classes $\{0,\dots,\ell\}$, along
with the sharp peak fitness function $A_H$ and the lumped mutation matrix $M_H$. The
concentration process $(C_n)_{n\in\mathbb{N}}$ in this framework is a Markov process with state
space $S^{\ell+1}$ (defined in formula (11.1)) and transition matrix given by: for any $n \in \mathbb{N}$
and $c, d \in S^{\ell+1}$,

\[ P\big(C_{n+1} = d \,\big|\, C_n = c\big) = P\Big(\frac{1}{m}\,\mathrm{Mult}\big(m, \Phi(c)\big) = d\Big), \]


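where $\Phi$ is the mapping from $S^{\ell+1}$ to $S^{\ell+1}$ given by: for $0 \le k \le \ell$ and $c \in S^{\ell+1}$,

\[ \Phi(c)(k) = \frac{\sum_{0\le i\le\ell} c(i)\,A_H(i)\,M_H(i,k)}{\sum_{0\le i\le\ell} c(i)\,A_H(i)}. \]

Let us denote by $I(p,t)$ the rate function governing the large deviations of a binomial
law of parameter $p \in [0,1]$:

\[ \forall t \in [0,1] \qquad I(p,t) = t\ln\frac{t}{p} + (1-t)\ln\frac{1-t}{1-p}, \]

with the convention $0\ln 0 = 0\ln(0/0) = 0$. We define, for $\sigma > 1$ and $a \in\, ]0,+\infty[$,

\[ Q(\sigma,a)(0) = \begin{cases} \dfrac{\sigma e^{-a}-1}{\sigma-1} & \text{if } \sigma e^{-a} > 1, \\[1ex] 0 & \text{if } \sigma e^{-a} \le 1, \end{cases} \]

\[ \psi(a) = \inf_{l\ge 1}\ \inf\Big\{ \sum_{k=0}^{l-1} I\Big(\frac{\sigma e^{-a}\rho_k}{(\sigma-1)\rho_k+1},\ \rho_{k+1}\Big) : \rho_0 = Q(\sigma,a)(0),\ \rho_l = 0,\ \rho_k \in [0,1] \text{ for } 0 \le k < l \Big\}. \]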
Here is the main result concerning the Wright—Fisher model. The symbols E and 
Var denote the expectation and the variance. 


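Theorem 11.2 We suppose that

\[ \ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0, \]
\[ \ell q \to a \in\, ]0,+\infty[, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty]. \]

We have the following dichotomy:
• If $\alpha\,\psi(a) < \ln\kappa$, then

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{n\to\infty} E(C_n(k)) = 0. \]

• If $\alpha\,\psi(a) > \ln\kappa$, then

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{n\to\infty} E(C_n(k)) = Q(\sigma,a)(k). \]

Moreover, in both cases,

\[ \forall k \ge 0 \qquad \lim_{\substack{\ell,m\to\infty,\ q\to 0 \\ \ell q\to a,\ m/\ell\to\alpha}}\ \lim_{n\to\infty} \mathrm{Var}(C_n(k)) = 0. \]
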

This theorem looks very similar to its counterpart for the Moran model. Indeed, 
the two possible limits for the invariant probability measure are the same, namely 
Dirac masses on 0 or on the quasispecies distribution. We have also a critical curve 
separating the quasispecies regime from the disordered regime and this curve has a 
similar shape to the one obtained for the Moran model. The crucial parameters in 
the limit are also the same, namely the mean number of mutations per chromosome 
per reproduction cycle $a$ and the limit of the ratio between the population size and
the genome length $\alpha$. The main difference, though, is that we do not have an explicit
formula for the critical curve. Instead, the function $\psi$ is defined as the solution to
a discrete variational problem. So, it seems that the formula for the quasispecies is
valid for several models in the infinite population limit, while the critical curve itself 
might change from one model to another. A full proof of theorem 11.2 can be found 
in [16, 20]. This proof relies on the construction of adequate couplings, which are 
simpler processes amenable to a mathematical analysis. We present in part IV a more 
efficient proof, which relies on the limit quasispecies equation and generic ideas from 
the Freidlin—Wentzell theory of random perturbations of dynamical systems [38]. 
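
Although $\psi$ has no closed form, the discrete variational problem can be explored numerically. The sketch below is an illustration under explicit assumptions and not the method used in the proofs: it restricts the sequences $(\rho_k)$ to decreasing sequences taking values in a uniform grid of $[0,\rho_0]$, with $\rho_0 = Q(\sigma,a)(0)$ (chapter 14 argues that decreasing sequences suffice), and minimizes the cost $\sum_k I(G_a(\rho_k),\rho_{k+1})$ with $G_a(s) = \sigma e^{-a}s/((\sigma-1)s+1)$ by dynamic programming. Every grid path is an admissible sequence, so the returned value is an upper bound for $\psi(a)$; the function names and the grid size are choices made for this example.

```python
import math


def rate(p, t):
    """Binomial rate function I(p, t), with the conventions 0 ln 0 = 0 ln(0/0) = 0."""
    total = 0.0
    if t > 0.0:
        if p <= 0.0:
            return math.inf
        total += t * math.log(t / p)
    if t < 1.0:
        if p >= 1.0:
            return math.inf
        total += (1.0 - t) * math.log((1.0 - t) / (1.0 - p))
    return total


def selection_map(sigma, a, s):
    """G_a(s) = sigma e^{-a} s / ((sigma - 1) s + 1)."""
    return sigma * math.exp(-a) * s / ((sigma - 1.0) * s + 1.0)


def psi_upper_bound(sigma, a, n=200):
    """Cheapest decreasing grid path from rho_0 down to 0 (an upper bound for psi)."""
    top = max(0.0, (sigma * math.exp(-a) - 1.0) / (sigma - 1.0))
    if top == 0.0:
        return 0.0
    grid = [top * j / n for j in range(n + 1)]
    images = [selection_map(sigma, a, x) for x in grid]
    cost = [math.inf] * (n + 1)  # cost[j]: cheapest path from grid[n] to grid[j]
    cost[n] = 0.0
    for j in range(n - 1, -1, -1):
        cost[j] = min(cost[i] + rate(images[i], grid[j]) for i in range(j + 1, n + 1))
    return cost[0]


if __name__ == "__main__":
    for a in (0.25, 0.5, 1.0):
        print(a, psi_upper_bound(4.0, a))
```

The single jump from $\rho_0$ to 0 is one of the admissible paths, so the result never exceeds $-\ln(1-\rho_0)$. Refining the grid can only be expected to tighten the bound; it does not certify the exact value of $\psi(a)$.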


Chapter 12
Computer Simulations


The results for the finite population models are supported by simulations. Figures 12.1,
12.2, 12.3, 12.4, 12.5, 12.6 show the fractions of the master sequence
and the first classes in the equilibrium population as a function of both $a = \ell q$ and
$\alpha = m/\ell$, for both the Wright-Fisher model and the Moran model. In figures 12.1
and 12.2, the ratio $\alpha = m/\ell$ is fixed and we vary $a = \ell q$, while in figures 12.3, 12.4
and 12.5, the parameter $a = \ell q$ is fixed and we vary $\alpha = m/\ell$. In figure 12.6, we
vary simultaneously $\alpha = m/\ell$ and $a = \ell q$, and we obtain a two-dimensional surface.
The programs are written in C with the help of the GNU Scientific Library and the
graphical output is generated with the help of the Gnuplot program. The number
of generations in a simulation run was adjusted empirically in order to stabilize the
output within a reasonable amount of time. Typically, in a simulation of the model
with parameters $\ell, m$, the number of generations is taken to be $100000 \times \max(\ell, m)$
multiplied by a factor between 1 and 100. The good news is that, already for small
values of $\ell$, the simulations are very conclusive.

Fig. 12.1 Density of the first mutant classes as a function of $a = \ell q$ for the Moran model
with $\sigma = 2$, $\ell = 16$ (up), 32 (down), $\alpha = m/\ell = 3$

Fig. 12.2 Density of the first mutant classes as a function of $a = \ell q$ for the Wright-Fisher model
with $\sigma = 2$, $\ell = 16$ (up), 32 (down), $\alpha = m/\ell = 3$

Fig. 12.3 Density of the first mutant classes as a function of $\alpha = m/\ell$ for the Moran model
with $\sigma = 2$, $\ell = 16$ (up), 32 (down), $a = \ell q = 0.5$

Fig. 12.4 Density of the first mutant classes as a function of $\alpha = m/\ell$ for the Wright-Fisher model
with $\sigma = 2$, $\ell = 16$ (up), 32 (down), $a = \ell q = 0.5$

Fig. 12.5 Density of the first mutant classes as a function of $\alpha = m/\ell$, $\ell = 32$, $a = \ell q = 0.5$, $\sigma = 2$,
Moran model (up), Wright-Fisher (down)

Fig. 12.6 Density of the master sequence as a function of $a = \ell q$ and $\alpha = m/\ell$, $\ell = 24$,
Moran model (up), Wright-Fisher (down)

Chapter 13
Heuristics


In the Moran model, at most one individual changes at a given time. Therefore the 
number of master sequences in the population can either decrease by 1, stay stable, 
or increase by 1. A priori, the probability of each move is a complicated function 
of the whole population, because we have to take into account the possibility of 
creating a master sequence through a mutation event. However, in the long chain 
regime, the probability of such a mutation event is of order $1/\ell$ and can be neglected.
Once the back mutations are neglected, the number of master sequences can be 
approximated by a genuine birth and death process. In this chapter, we explain how 
this approximation leads to the results of theorem 11.1, but we do not present a full 
rigorous proof. 


13.1 A Simplified Process 


Let us denote by $N_t$ the number of master sequences present in the population $X_t$
of the Moran process defined in section 4.3. In the long chain regime, we have the
following expansions:

\begin{align*}
P(N_{t+dt} = i-1 \mid N_t = i) &= \big(\sigma i^2(1-e^{-a}) + i(m-i)\big)\,dt + o(dt), \\
P(N_{t+dt} = i+1 \mid N_t = i) &= \big(\sigma i(m-i)e^{-a}\big)\,dt + o(dt).
\end{align*}

In the death transition rate, the term $\sigma i(1-e^{-a})$ corresponds to a master sequence
reproducing and mutating, whereas the term $m - i$ corresponds to a non-master
sequence reproducing (recall that back mutations have been neglected). The factor
$i$ corresponds to a master sequence dying. Likewise, in the birth transition rate, the
factor $\sigma i e^{-a}$ corresponds to a master sequence reproducing and not mutating, and
the factor $m - i$ corresponds to a non-master sequence being replaced. The remainder
terms $o(dt)$ are uniform with respect to $i \le m$. Therefore we are naturally led to
compare the process $(N_t)_{t\ge 0}$ with a birth and death process $(Z_t)_{t\ge 0}$ on $\{0,\dots,m\}$,
with the same transition rates. This process is a variant of the one introduced by
Nowak and Schuster [69]. The approximation of the number of master sequences
$N_t$ in the Moran model by the birth and death process $Z_t$ no longer works when
$N_t = 0$. When $N_t = 0$, there is no master sequence and the Moran process evolves
in the set of the neutral populations under the action of the mutations. When $\ell$ is
large, the mutations induce a strong drift which drives the individuals away from the
master sequence, typically at distance $\ell(1 - 1/\kappa)$. In order to get a simple picture, we
replace the whole population by a single individual which evolves on $\{0,\dots,\kappa\}^\ell$
under mutation. As usual, we decompose the genotype space into Hamming classes
and we define

\[ Y_t = \text{Hamming class of the individual at time } t. \]


Applying the lumping theorem A.3, we see that the process $(Y_t)_{t\ge 0}$ on the Hamming
classes is a Markov chain on $\{0,\dots,\ell\}$ with transition matrix the lumped mutation
matrix $M_H$ introduced in section 6.3. Thus the Moran process is approximated by
the process on

\[ \big(\{0,\dots,\ell\}\times\{0\}\big) \cup \big(\{0\}\times\{0,\dots,m\}\big) \]

described as follows. On $\{0,\dots,\ell\}\times\{0\}$, which we identify with the set $\{0,\dots,\ell\}$,
the process follows the dynamics of the Markov chain $(Y_t)_{t\ge 0}$ with transition matrix
$M_H$. On $\{0\}\times\{0,\dots,m\}$, which we identify with $\{0,\dots,m\}$, the process
follows the dynamics of the birth and death process $(Z_t)_{t\ge 0}$. When at the point
$(0,0)$, the process can jump to either axis. We denote by $(\widetilde{X}_t)_{t\ge 0}$ this simplified
process, whose state space is the union of the two segments $\{0,\dots,\ell\}\times\{0\}$ and
$\{0\}\times\{0,\dots,m\}$. Of course, it is far from obvious that results on this simplified
process are relevant for the original Moran process. However, this turns out to be the 
case. This can be justified rigorously in two ways. A first way is to make effective use 
of the uniform asymptotic expansions above to derive a full rigorous proof. A sec- 
ond way is to construct couplings between the Moran model and judicious simplified 
processes. This is the approach implemented in [15, 17] (more precise asymptotic 
expansions are computed in [9, 10]). However our goal here is to understand on a 
simple model the mechanisms leading to the existence of an error threshold. So we 
will focus on the analysis of the simplified model, without further mathematical jus- 
tification. In part IV, we shall rigorously prove theorem 11.2 for the Wright—Fisher 
model. 


13.2 A Renewal Argument 


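We suppose that the process $(\widetilde{X}_t)_{t\ge 0}$ starts from $(0,0)$. We define two sequences
$(\tau_k^*)_{k\ge 1}$, $(\tau_k)_{k\ge 0}$ of stopping times by setting $\tau_0 = 0$ and

\begin{align*}
\tau_1^* &= \inf\{t \ge 0 : \widetilde{X}_t = (0,1)\}, & \tau_1 &= \inf\{t \ge \tau_1^* : \widetilde{X}_t = (0,0)\}, \\
&\ \,\vdots & &\ \,\vdots \\
\tau_k^* &= \inf\{t \ge \tau_{k-1} : \widetilde{X}_t = (0,1)\}, & \tau_k &= \inf\{t \ge \tau_k^* : \widetilde{X}_t = (0,0)\}.
\end{align*}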


For any $k \ge 1$, by the strong Markov property, the trajectory $(\widetilde{X}_t)_{t\ge\tau_k}$ of the process
after time $\tau_k$ is independent from the trajectory $(\widetilde{X}_t)_{t\le\tau_k}$ of the process until time $\tau_k$,
and its law is the same as the law of the whole process $(\widetilde{X}_t)_{t\ge 0}$ starting from $(0,0)$.
As a consequence, the successive excursions

\[ (\widetilde{X}_t,\ \tau_k \le t \le \tau_{k+1}), \qquad k \ge 1, \]

are independent and identically distributed. In particular, the sequence

\[ (\tau_{k+1} - \tau_k)_{k\ge 1} \]

is a sequence of i.i.d. random variables, having the same law as the random time $\tau_1$
whenever the process $(\widetilde{X}_t)_{t\ge 0}$ starts from $(0,0)$. For $k \ge 1$, we decompose $\tau_k$ as the
sum

\[ \tau_k = \tau_1 + \sum_{h=1}^{k-1} (\tau_{h+1} - \tau_h). \]


Applying the classical law of large numbers, we get

\[ \lim_{k\to\infty} \frac{\tau_k}{k} = E_{(0,0)}(\tau_1) \qquad \text{with probability 1.} \]

We define next

\[ \forall t \ge 0 \qquad K(t) = \max\{k \ge 0 : \tau_k \le t\}. \]

From the very definition of $K(t)$, we have

\[ \forall t \ge 0 \qquad \tau_{K(t)} \le t < \tau_{K(t)+1}, \]

and since $\tau_k$ goes to $\infty$ with $k$, then

\[ \lim_{t\to\infty} K(t) = +\infty \qquad \text{with probability 1.} \]

We rewrite the previous double inequality as

\[ \frac{\tau_{K(t)}}{K(t)} \le \frac{t}{K(t)} < \frac{\tau_{K(t)+1}}{K(t)+1}\,\frac{K(t)+1}{K(t)}. \]

Sending $t$ to $\infty$, we conclude that

\[ \lim_{t\to\infty} \frac{K(t)}{t} = \frac{1}{E_{(0,0)}(\tau_1)} \qquad \text{with probability 1.} \]


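Let $N(\widetilde{X}_t)$ be the second coordinate of the vector $\widetilde{X}_t$, which represents the number of
master sequences present in the simplified process at time $t$. By the ergodic theorem
(see theorem 16.1), we have

\[ \lim_{t\to\infty} E_{(0,0)}\big(N(\widetilde{X}_t)\big) = \lim_{t\to\infty} \frac{1}{t}\int_0^t N(\widetilde{X}_s)\,ds. \]

We decompose the last integral as follows:

\[ \int_0^t N(\widetilde{X}_s)\,ds = \sum_{k=1}^{K(t)} \int_{\tau_{k-1}}^{\tau_k} N(\widetilde{X}_s)\,ds + \int_{\tau_{K(t)}}^{\tau_{K(t)+1}\wedge t} N(\widetilde{X}_s)\,ds, \]

where $\tau_{K(t)+1}\wedge t$ stands for $\min(\tau_{K(t)+1}, t)$. For $k \ge 1$, the integral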


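\[ N_k = \int_{\tau_{k-1}}^{\tau_k} N(\widetilde{X}_s)\,ds \]

is a deterministic function of the excursion $(\widetilde{X}_s,\ \tau_{k-1} \le s \le \tau_k)$, hence the random
variables $(N_k, k \ge 1)$ are independent identically distributed. With probability one,
$K(t)$ goes to $\infty$ as $t$ goes to $\infty$, thus by the classical law of large numbers, we have

\[ \lim_{t\to\infty} \frac{1}{K(t)}\sum_{k=1}^{K(t)} N_k = E_{(0,0)}(N_1) \qquad \text{with probability 1.} \]

Writing

\[ \frac{1}{t}\int_0^t N(\widetilde{X}_s)\,ds = \frac{K(t)}{t}\,\frac{1}{K(t)}\sum_{k=1}^{K(t)} N_k + \frac{1}{t}\int_{\tau_{K(t)}}^{\tau_{K(t)+1}\wedge t} N(\widetilde{X}_s)\,ds \]

and letting $t$ go to $\infty$, we conclude that, with probability 1,

\[ \lim_{t\to\infty} \frac{1}{t}\int_0^t N(\widetilde{X}_s)\,ds = \frac{1}{E_{(0,0)}(\tau_1)}\,E_{(0,0)}\Big(\int_0^{\tau_1} N(\widetilde{X}_s)\,ds\Big). \]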

Let us rewrite these quantities in terms of the processes $(Y_t)_{t\ge 0}$ and $(Z_t)_{t\ge 0}$. We
define the persistence time $\tau^0$ as

\[ \tau^0 = \inf\{t \ge 0 : Z_t = 0\}. \]

We have then

\[ E_{(0,0)}\Big(\int_0^{\tau_1} N(\widetilde{X}_s)\,ds\Big) = E\Big(\int_0^{\tau^0} Z_s\,ds \,\Big|\, Z_0 = 1\Big). \]

As for the denominator, we write

\begin{align*}
E_{(0,0)}(\tau_1) &= E_{(0,0)}(\tau_1^*) + E_{(0,0)}(\tau_1 - \tau_1^*) \\
&= E_{(0,0)}(\tau_1^*) + E(\tau^0 \mid Z_0 = 1).
\end{align*}

The main contribution to the expectation $E_{(0,0)}(\tau_1^*)$ corresponds to the trajectories of
the process $(Y_t)_{t\ge 0}$ in the neutral phase. Thus, letting

\[ \tau_0 = \inf\{t > 0 : Y_t = 0\}, \]

we have the approximation

\[ E_{(0,0)}(\tau_1^*) \sim E(\tau_0 \mid Y_0 = 0). \]

For a single individual which evolves on $\{0,\dots,\kappa\}^\ell$ under the mutation kernel $M$,
the invariant probability measure is the uniform probability measure, therefore a
classical formula of Markov chain theory yields that the mean return time to 0 for
$(Y_t)_{t\ge 0}$ is

\[ E(\tau_0 \mid Y_0 = 0) = \kappa^\ell. \]


Collecting together the previous computations, we obtain that

\[ \lim_{t\to\infty} E_{(0,0)}\big(N(\widetilde{X}_t)\big) \sim \frac{1}{\kappa^\ell + E(\tau^0 \mid Z_0 = 1)}\,E\Big(\int_0^{\tau^0} Z_s\,ds \,\Big|\, Z_0 = 1\Big). \]

Now everything boils down to the asymptotic behavior of the mean persistence
time $E(\tau^0 \mid Z_0 = 1)$. If the mean persistence time is negligible compared to $\kappa^\ell$,
then the fraction of the master sequences at equilibrium will be null. If the mean
persistence time is much larger than $\kappa^\ell$, then the number of the master sequences at
equilibrium will be distributed as in the birth and death process $(Z_t)_{t\ge 0}$. Indeed the
previous argument can be carried out for a general function of the number of master
sequences.


13.3 Persistence Time 


This section is devoted to estimating the expectation of the persistence time $\tau^0$.
So we consider a continuous time birth and death process $(Z_t)_{t\ge 0}$ on the finite set
$\{0,\dots,m\}$ with transition rates $\gamma_i, \delta_i$, $0 \le i \le m$, i.e., for any $t \ge 0$,

\begin{align*}
P(Z_{t+dt} = i+1 \mid Z_t = i) &= \delta_i\,dt + o(dt), & 0 \le i < m, \\
P(Z_{t+dt} = i-1 \mid Z_t = i) &= \gamma_i\,dt + o(dt), & 0 < i \le m,
\end{align*}

where

\[ \gamma_i = \sigma i^2(1-e^{-a}) + i(m-i), \qquad \delta_i = \sigma i(m-i)e^{-a}, \qquad 0 \le i \le m. \]


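We suppose that all the transition rates are positive, except at the boundary, where
$\gamma_0 = 0$ and $\delta_m = 0$. We define

\[ \pi(0) = 1, \qquad \pi(i) = \frac{\delta_1 \cdots \delta_i}{\gamma_1 \cdots \gamma_i}, \qquad 0 < i \le m. \]

Let $\tau^0$ be the hitting time of 0, defined by

\[ \tau^0 = \inf\{t \ge 0 : Z_t = 0\}. \]

The following explicit formula for the expected value of $\tau^0$ can be found in classical
books (for instance [52]):

\[ E(\tau^0 \mid Z_0 = 1) = \sum_{i=1}^{m} \frac{1}{\delta_i}\,\pi(i). \tag{13.1} \]

(For $i = m$, where $\delta_m = 0$, the term is understood as $\delta_1\cdots\delta_{m-1}/(\gamma_1\cdots\gamma_m)$.)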


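Our first goal is to estimate the products $\pi(i)$. We start by studying the ratio $\delta_i/\gamma_i$.
We define a function $f$ on $[0,1]$ by setting

\[ \forall \rho \in [0,1] \qquad f(\rho) = \frac{\sigma(1-\rho)e^{-a}}{\sigma\rho(1-e^{-a}) + (1-\rho)}. \]

We have then

\[ \forall i \in \{1,\dots,m\} \qquad \frac{\delta_i}{\gamma_i} = f\Big(\frac{i}{m}\Big). \]

What matters for the behavior of the products $\pi(i)$ is whether the ratio $\delta_i/\gamma_i$ is larger
or smaller than 1. If we take as new variable $\rho = i/m$, the equation $\delta_i = \gamma_i$ can be
rewritten as

\[ (\sigma-1)\rho^2 + (1-\sigma e^{-a})\rho = 0. \]

The largest root of this equation is

\[ \rho^* = \begin{cases} \dfrac{\sigma e^{-a}-1}{\sigma-1} & \text{if } \sigma e^{-a} > 1, \\[1ex] 0 & \text{if } \sigma e^{-a} \le 1. \end{cases} \]

This readily implies that

\begin{align*}
1 \le i \le j \le \lfloor\rho^* m\rfloor \quad &\Longrightarrow \quad \pi(i) \le \pi(j), \\
\lfloor\rho^* m\rfloor \le i \le j \le m \quad &\Longrightarrow \quad \pi(i) \ge \pi(j).
\end{align*}

It follows that the product $\pi(i)$ is maximal for $i = \lfloor\rho^* m\rfloor$:

\[ \max_{1\le i\le m} \pi(i) = \pi(\lfloor\rho^* m\rfloor). \]

Let $\rho \in [0,1]$. For $m \ge 1$, we have

\[ \frac{1}{m}\ln\pi(\lfloor\rho m\rfloor) = \frac{1}{m}\sum_{i=1}^{\lfloor\rho m\rfloor} \ln f\Big(\frac{i}{m}\Big). \]

This sum is a Riemann sum. Sending $m$ to $\infty$, we get

\[ \lim_{m\to\infty} \frac{1}{m}\ln\pi(\lfloor\rho m\rfloor) = \int_0^\rho \ln f(s)\,ds. \tag{13.2} \]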


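Since $f(s) > 1$ for $s < \rho^*$ and $f(s) < 1$ for $s > \rho^*$, the integral is maximal for
$\rho = \rho^*$. Moreover, we have the following bounds on $\delta_i$:

\[ \forall i \in \{1,\dots,m-1\} \qquad \sigma e^{-a} \le \delta_i \le m^2\sigma e^{-a}. \]

Together with formula (13.1), this yields the following inequalities:

\[ \frac{1}{m^2\sigma e^{-a}}\,\pi(\lfloor\rho^* m\rfloor) \le E(\tau^0 \mid Z_0 = 1) \le \frac{m}{\sigma e^{-a}}\,\pi(\lfloor\rho^* m\rfloor). \]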

Using the asymptotic estimate (13.2), we obtain that

\[ \lim_{m\to\infty} \frac{1}{m}\ln E(\tau^0 \mid Z_0 = 1) = \int_0^{\rho^*} \ln f(s)\,ds. \]

We compute the integral and we obtain the function $\phi(a)$ defined in section 11.1.
Thus we conclude that

\[ E(\tau^0 \mid Z_0 = 1) \sim \exp\big(m\,\phi(a)\big). \]

Comparing this quantity with $\kappa^\ell$ in our asymptotic regime, we obtain the dichotomy
presented in theorem 11.1. Only the heuristics of the proof of this theorem have been
presented; in the sequel we shall present the full proof of theorem 11.2 in the case
of the Wright-Fisher model.
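
The exponential growth of the mean persistence time can be checked on a computer for moderate values of $m$. The sketch below is an illustration, not part of the argument: it evaluates $\ln E(\tau^0 \mid Z_0 = 1)$ exactly from the rates $\gamma_i = \sigma i^2(1-e^{-a}) + i(m-i)$ and $\delta_i = \sigma i(m-i)e^{-a}$, using the standard birth and death formula written as $\sum_{i=1}^m \delta_1\cdots\delta_{i-1}/(\gamma_1\cdots\gamma_i)$ (which avoids dividing by $\delta_m = 0$), and compares it with $m\,\phi(a)$. The function names are assumptions of this example, and the closed formula for $\phi$ is used away from its removable singularity $\sigma(1-e^{-a}) = 1$.

```python
import math


def log_mean_persistence(m, sigma, a):
    """ln E(tau^0 | Z_0 = 1) for the birth and death process with rates gamma_i, delta_i."""
    ea = math.exp(-a)
    log_terms = []
    log_ratio = 0.0  # ln( delta_1 ... delta_{i-1} / (gamma_1 ... gamma_{i-1}) )
    for i in range(1, m + 1):
        gamma_i = sigma * i * i * (1.0 - ea) + i * (m - i)
        log_terms.append(log_ratio - math.log(gamma_i))
        if i < m:
            delta_i = sigma * i * (m - i) * ea
            log_ratio += math.log(delta_i) - math.log(gamma_i)
    # Log-sum-exp keeps the computation stable when the terms are huge.
    top = max(log_terms)
    return top + math.log(sum(math.exp(x - top) for x in log_terms))


def phi(sigma, a):
    """Closed formula for phi(a); requires sigma * (1 - exp(-a)) != 1."""
    if a >= math.log(sigma):
        return 0.0
    b = sigma * (1.0 - math.exp(-a))
    return (b * math.log(b / (sigma - 1.0)) + math.log(sigma * math.exp(-a))) / (1.0 - b)


if __name__ == "__main__":
    sigma, a = 4.0, 0.5
    for m in (100, 1000, 4000):
        print(m, log_mean_persistence(m, sigma, a) / m, phi(sigma, a))
```

The ratio $\frac{1}{m}\ln E(\tau^0 \mid Z_0 = 1)$ approaches $\phi(a)$ slowly, with a correction of order $(\ln m)/m$ coming from the polynomial prefactors in the bounds above.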


Chapter 14
Shape of the Critical Curve


The critical curves described in theorems 11.1 and 11.2 appear from the simulations
to be increasing, that is, the larger $a$, the larger should $\alpha$ be for a quasispecies to be
able to form. This amounts to the functions $\phi(a)$ and $\psi(a)$ being decreasing in $a$.
The objective of this chapter is to prove that this is indeed the case.


14.1 Critical Curve for the Moran Model 


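We have seen in section 13.3 that the function $\phi(a)$ is given by the integral

\[ \phi(a) = \int_0^{\rho^*(a)} \ln f_a(s)\,ds, \]

where

\[ \rho^*(a) = \begin{cases} \dfrac{\sigma e^{-a}-1}{\sigma-1} & \text{if } \sigma e^{-a} > 1, \\[1ex] 0 & \text{if } \sigma e^{-a} \le 1, \end{cases} \]

and the function $f_a(s)$ is defined for $s \in [0,1]$ by

\[ f_a(s) = \frac{\sigma(1-s)e^{-a}}{\sigma s(1-e^{-a}) + 1 - s}. \]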


Since the integrand $\ln f_a(s)$ is non-negative on $[0,\rho^*(a)]$ and both $\rho^*(a)$ and $f_a(s)$
are decreasing functions of $a$, the function $\phi(a)$ is also decreasing in $a$.


14.2 Critical Curve for the Wright-Fisher Model


The case of the Wright-Fisher model is trickier. Let us define the mapping
$G_a : [0,1] \to [0,1]$ by

\[ \forall s \in [0,1] \qquad G_a(s) = \frac{\sigma e^{-a}s}{(\sigma-1)s+1}, \]

and let us recall the definition of the rate function $I(p,t)$:

\[ \forall p \in [0,1] \quad \forall t \in [0,1] \qquad I(p,t) = t\ln\frac{t}{p} + (1-t)\ln\frac{1-t}{1-p}. \]

The function $\psi(a)$ determining the shape of the critical curve in theorem 11.2 is
given by

\[ \psi(a) = \inf_{l\ge 1}\ \inf\Big\{ \sum_{k=0}^{l-1} I\big(G_a(\rho_k),\rho_{k+1}\big) : \rho_0 = \rho^*(a),\ \rho_l = 0,\ \rho_k \in [0,1] \text{ for } 0 \le k < l \Big\}. \]

On one hand, the mapping $G_a$ is non-decreasing and it has two fixed points, 0 and
$\rho^*(a)$. Moreover, for any $r \in (0,\rho^*(a))$ we have $r < G_a(r) < \rho^*(a)$ and for any
$r \in (\rho^*(a),1]$ we have $\rho^*(a) < G_a(r) < r$. On the other hand, the mapping $I(p,t)$ is
a non-negative convex function on $[0,1]^2$ satisfying $I(p,t) = 0$ if and only if $p = t$
(this can be checked by computing the second derivatives of $I$). Using these two
observations, we see that for any sequence $(\rho_k)_{0\le k\le l}$ such that

\[ \rho_l \le \rho_0 \le \rho^*(a), \qquad \forall k \in \{0,\dots,l-1\} \quad \rho_k \ge \rho_0, \]

we have

\[ G_a(\rho_{l-1}) \ge G_a(\rho_0) \ge \rho_0 \ge \rho_l, \]

whence also

\[ I\big(G_a(\rho_0),\rho_1\big) + \cdots + I\big(G_a(\rho_{l-1}),\rho_l\big) \ge I\big(G_a(\rho_{l-1}),\rho_l\big) \ge I\big(G_a(\rho_0),\rho_l\big). \]

Therefore, in order to realize the infimum in the definition of $\psi(a)$, it is enough to
consider sequences $(\rho_k)_{0\le k\le l}$ that are decreasing.

Let us consider two values $a_1 < a_2$. We will prove that $\psi(a_1) \ge \psi(a_2)$. Let $\delta > 0$
and let $(\rho_k)_{0\le k\le l}$ be a sequence in $[0,1]$ such that $\rho_0 = \rho^*(a_1)$, $\rho_l = 0$ and

\[ \sum_{k=0}^{l-1} I\big(G_{a_1}(\rho_k),\rho_{k+1}\big) \le \psi(a_1) + \delta. \]

Note that $\rho^*(a)$ (which has been defined in the previous section) is a decreasing
function of $a$, so that $\rho^*(a_1) \ge \rho^*(a_2)$. Let $k^*$ be the last index in $0,\dots,l$ such that
$\rho_{k^*} \ge \rho^*(a_2)$, and define the sequence $(r_k)_{0\le k\le l-k^*}$ by setting $r_0 = \rho^*(a_2)$ and

\[ \forall k \in \{1,\dots,l-k^*\} \qquad r_k = \rho_{k^*+k}. \]

We will show that

\[ \psi(a_2) \le \sum_{k=0}^{l-k^*-1} I\big(G_{a_2}(r_k), r_{k+1}\big) \le \sum_{k=k^*}^{l-1} I\big(G_{a_1}(\rho_k),\rho_{k+1}\big) \le \psi(a_1) + \delta. \]



The first and last inequalities are immediate from the choice of the sequences
$(\rho_k)_{0\le k\le l}$ and $(r_k)_{0\le k\le l-k^*}$ and the definition of the function $\psi$, so the only thing we
need to show is the middle inequality. For that, we compare the terms pairwise. Let
$k \in \{1,\dots,l-k^*-1\}$. We have

\[ r_k = \rho_{k^*+k} < \rho^*(a_2) \le \rho^*(a_1). \]

Note that the function $G_a(s)$ is decreasing in $a$, and as long as $s < \rho^*(a)$ we have
$G_a(s) > s$. Since we are considering only sequences that are decreasing, we have

\[ r_{k+1} = \rho_{k^*+k+1} \le r_k < G_{a_2}(r_k) \le G_{a_1}(\rho_{k^*+k}). \]

Therefore,

\[ I\big(G_{a_1}(\rho_{k^*+k}),\rho_{k^*+k+1}\big) \ge I\big(G_{a_2}(r_k), r_{k+1}\big). \]

Finally, let us examine the case $k = 0$. We have

\[ \rho_{k^*} \ge \rho^*(a_2) \ge \rho_{k^*+1}, \]

whence

\[ G_{a_1}(\rho_{k^*}) \ge G_{a_2}(\rho_{k^*}) \ge \rho^*(a_2). \]

It follows that

\begin{align*}
I\big(G_{a_1}(\rho_{k^*}),\rho_{k^*+1}\big) &\ge I\big(G_{a_2}(\rho_{k^*}),\rho_{k^*+1}\big) \\
&\ge I\big(G_{a_2}(\rho^*(a_2)),\rho_{k^*+1}\big) = I\big(G_{a_2}(r_0), r_1\big).
\end{align*}

This concludes the argument.


Chapter 15
Framework for the Proofs


The proofs of theorems 11.1 and 11.2, although different, share some common
arguments. We present here a framework which works for both the Moran and the
Wright-Fisher model. Recall that theorems 11.1 and 11.2 have been formulated
in terms of the concentration processes $C_t$ for the Moran model and $C_n$ for the
Wright-Fisher model. Since $\ell$ goes to $\infty$, the state space
$S^{\ell+1} = \{c \in [0,1]^{\ell+1} : \sum_{0\le i\le\ell} c(i) = 1\}$ of the concentration processes becomes infinite-dimensional and
the convergence towards equilibrium is quite delicate to handle. So our strategy is to
work from the start in an infinite-dimensional setting, which we describe next. We
define $S^\infty$ as

\[ S^\infty = \Big\{ c \in [0,1]^{\mathbb{N}} : \sum_{i\in\mathbb{N}} c(i) \le 1 \Big\}. \]

For $K \ge 1$, we define the $K$-dimensional set

\[ S^K = \Big\{ c \in [0,1]^K : \sum_{i=0}^{K-1} c(i) \le 1 \Big\}, \]

and the natural projection $\pi_K$ from $S^\infty$ (or $S^{\ell+1}$ when $\ell \ge K$) to $S^K$ by

\[ \forall c \in S^\infty \qquad \pi_K(c) = \big(c(0),\dots,c(K-1)\big). \]

We endow $S^\infty$ with the product $\sigma$-field, which is the smallest $\sigma$-field such that all
the projections $\pi_K$ are measurable. We say that a sequence $(\nu_n)_{n\ge 1}$ of measures on
$S^\infty$ converges weakly towards a measure $\nu$ if and only if, for any $K \ge 1$, the first $K$
marginals converge to those of $\nu$, i.e., for any $K \ge 1$, for any continuous function
$f : S^K \to \mathbb{R}$,

\[ \lim_{n\to\infty} \int_{S^\infty} f(\pi_K(c))\,d\nu_n(c) = \int_{S^\infty} f(\pi_K(c))\,d\nu(c). \]


Each finite-dimensional simplex is embedded into $S^\infty$ and we think of a probability
measure on a finite-dimensional simplex as a probability measure on $S^\infty$. Let
$K \ge 1$. The set $S^K$ is compact. By Prohorov's theorem (see for instance [11], chapter 1,
section 6), the set of the probability measures on $S^K$ is sequentially compact
for the weak convergence. Using a standard diagonal argument, together with
Kolmogorov's extension theorem, we see that the set of the probability measures on $S^\infty$
is also sequentially compact. Our strategy to prove the convergence of the invariant
probability measures of the finite models towards the Dirac mass on the solutions to
the quasispecies equation relies on the following standard result.


Proposition 15.1 Let $(\nu_n)_{n \ge 1}$ be a sequence of probability measures on $S^\infty$. Suppose
that there exists a probability measure $\nu^*$ on $S^\infty$ such that any weakly converging
subsequence of $(\nu_n)_{n \ge 1}$ converges towards $\nu^*$. Then the whole sequence $(\nu_n)_{n \ge 1}$
converges weakly towards $\nu^*$.


Proof Let $(\nu_n)_{n \ge 1}$ and $\nu^*$ be as in the hypothesis of the proposition. Suppose that
the whole sequence $(\nu_n)_{n \ge 1}$ does not converge weakly towards $\nu^*$. Then there exist
$K \ge 1$ and a continuous function $f : S^K \to \mathbb{R}$ such that, for some $\varepsilon > 0$ and for
some subsequence $(\nu_{\phi(n)})_{n \ge 1}$,


$$\Big| \int_{S^\infty} f(\pi_K(c)) \, d\nu_{\phi(n)}(c) - \int_{S^\infty} f(\pi_K(c)) \, d\nu^*(c) \Big| \ge \varepsilon.$$


Now, since the set of the probability measures on $S^\infty$ is sequentially compact,
we can re-extract from $(\nu_{\phi(n)})_{n \ge 1}$ a converging subsequence. By hypothesis, this
subsequence should converge towards $\nu^*$. This stands in contradiction with the
above inequality. □


15.1 Candidate Limits for Moran 


We denote by $\nu_{m,\ell,q}$ the invariant probability measure of the concentration process
$(C_t)_{t \in \mathbb{R}^+}$. Theorem 11.1 is in fact a result on the asymptotic behavior of $\nu_{m,\ell,q}$.
Indeed, we have, for $k \ge 0$,


$$\lim_{t \to \infty} E(C_t(k)) = \int_{S^\infty} c(k) \, d\nu_{m,\ell,q}(c),$$

$$\lim_{t \to \infty} \mathrm{var}(C_t(k)) = \int_{S^\infty} c(k)^2 \, d\nu_{m,\ell,q}(c) - \Big( \int_{S^\infty} c(k) \, d\nu_{m,\ell,q}(c) \Big)^2.$$


So our strategy is to focus on $\nu_{m,\ell,q}$ and to describe its possible weak limits in our
specific asymptotic regime. We denote by $(\Phi^t_\infty)_{t \in \mathbb{R}^+}$ the semigroup of transformations
on $S^\infty$ associated to the limiting differential system of Eigen's model. More precisely,
for $c \in S^\infty$, the unique solution $(c_t)_{t \ge 0}$ of the system (10.2) with initial condition $c$ at
time 0 is given by

$$\forall t \ge 0 \quad \forall k \ge 0 \qquad c_t(k) = \Phi^t_\infty(c)(k).$$


Proposition 15.2 In the long chain regime, any weak limit of a subsequence of the
family $\nu_{m,\ell,q}$ is invariant under the semigroup of transformations $(\Phi^t_\infty)_{t \in \mathbb{R}^+}$.


Proof Let $K \ge 1$ and let $f : S^K \to \mathbb{R}$ be a continuous function. We have


$$\int_{S^\infty} L_{m,\ell,q}(f \circ \pi_K)(c) \, d\nu_{m,\ell,q}(c) = 0. \tag{15.1}$$


We shall proceed as in the proof of theorem 4.4. As $m$ goes to $\infty$, the rescaled
generator $m L_{m,\ell,q}$ converges towards the differential operator $L_\infty$, defined as follows:
for any function $f$ which depends only on a finite number of coordinates, for any
$c \in S^\infty$,


$$L_\infty f(c) = \sum_{0 \le k \le i} c(k) A_H(k) e^{-a} \frac{a^{i-k}}{(i-k)!} \frac{\partial f}{\partial c(i)}(c) \;-\; \sum_{j,k \ge 0} c(j) \, c(k) A_H(k) \frac{\partial f}{\partial c(j)}(c). \tag{15.2}$$

This is the analog of formula (4.4) in our case. Suppose now that $\nu^*$ is a probability
measure on $S^\infty$ which is the weak limit of a subsequence of $\nu_{m,\ell,q}$. Passing to the
limit along a subsequence in equality (15.1), we see that $\nu^*$ satisfies


$$\int_{S^\infty} L_\infty(f \circ \pi_K)(c) \, d\nu^*(c) = 0.$$


This is true for any continuous function $f : S^K \to \mathbb{R}$ and any $K \ge 1$. Thus $\nu^*$ is
invariant for the infinitesimal generator $L_\infty$. Now the generator $L_\infty$ is precisely the
infinitesimal generator of the semigroup of transformations $(\Phi^t_\infty)_{t \in \mathbb{R}^+}$, that is, for any
function $f$ which depends only on a finite number of coordinates, for any $t \ge 0$,


$$\frac{d}{dt} \int_{S^\infty} f(\Phi^t_\infty(c)) \, d\nu^*(c) = \int_{S^\infty} L_\infty f(\Phi^t_\infty(c)) \, d\nu^*(c).$$


This identity can be checked by a direct computation, using formulas (10.2)
and (15.2). A key point is that the limiting system (10.2) is triangular, so that
the dynamics of the first $K$ coordinates do not depend on the other coordinates;
accordingly, for any $K \ge 1$, we can define the restriction $\Phi^{t,K}_\infty$ of $\Phi^t_\infty$ to the first $K$
coordinates, and we have


$$\forall c \in S^\infty \qquad \pi_K(\Phi^t_\infty(c)) = \Phi^{t,K}_\infty(\pi_K(c)),$$


so that in the end the computations involve only a finite number of variables. This
way we conclude that $\nu^*$ is also invariant for the semigroup of transformations
$(\Phi^t_\infty)_{t \in \mathbb{R}^+}$. □

We denote by $\delta_0$ the Dirac mass on the null sequence and by $\delta_{Q(\sigma,a)}$ the Dirac mass
on the quasispecies distribution $Q(\sigma, a)$.

Proposition 15.3 The only probability measures on $S^\infty$ which are invariant under
the semigroup of transformations $(\Phi^t_\infty)_{t \in \mathbb{R}^+}$ are the convex combinations of $\delta_0$ and
$\delta_{Q(\sigma,a)}$.

Proof Let $\nu$ be a probability measure on $S^\infty$ which is invariant under the semigroup
of transformations $(\Phi^t_\infty)_{t \in \mathbb{R}^+}$. Let $K \ge 1$ and let $f : S^K \to \mathbb{R}$ be a continuous
function. We have


$$\forall t \ge 0 \qquad \int_{S^\infty} f(\pi_K(c)) \, d\nu(c) = \int_{S^\infty} f\big(\pi_K(\Phi^t_\infty(c))\big) \, d\nu(c).$$


The convergence of the trajectory $(\Phi^t_\infty(c))_{t \in \mathbb{R}^+}$, for any $c \in S^\infty$, is analyzed in
theorem 10.2. Let us introduce the set


$$S_0 = \{ c \in S^\infty : c(0) = 0 \}.$$


Letting $t$ go to $\infty$ and applying the dominated convergence theorem to the right
integral, we obtain that

$$\int_{S^\infty} f(\pi_K(c)) \, d\nu(c) = \nu(S_0) \, f(\pi_K(0)) + \nu(S^\infty \setminus S_0) \, f\big(\pi_K(Q(\sigma, a))\big).$$


This identity holds for any $K \ge 1$ and any continuous function $f : S^K \to \mathbb{R}$. We
conclude that

$$\nu = \nu(S_0) \, \delta_0 + \nu(S^\infty \setminus S_0) \, \delta_{Q(\sigma,a)},$$


hence $\nu$ is a convex combination of $\delta_0$ and $\delta_{Q(\sigma,a)}$. □


15.2 Candidate Limits for Wright—Fisher 


We denote by $\nu_{m,\ell,q}$ the invariant probability measure of the concentration process
$(C_n)_{n \in \mathbb{N}}$. Theorem 11.2 is in fact a result on the asymptotic behavior of $\nu_{m,\ell,q}$.
Indeed, we have, for $k \ge 0$,


$$\lim_{n \to \infty} E(C_n(k)) = \int_{S^\infty} c(k) \, d\nu_{m,\ell,q}(c),$$

$$\lim_{n \to \infty} \mathrm{var}(C_n(k)) = \int_{S^\infty} c(k)^2 \, d\nu_{m,\ell,q}(c) - \Big( \int_{S^\infty} c(k) \, d\nu_{m,\ell,q}(c) \Big)^2.$$


So our strategy is to focus on $\nu_{m,\ell,q}$ and to describe its possible weak limits in
our specific asymptotic regime. The mean of the multinomial law appearing in the
transition mechanism of the Wright–Fisher model is given by the map $\Phi$ defined in
formula (3.4). The limit of the map $\Phi$ in the long chain regime is the map $\Phi_\infty$ from
$S^\infty$ to $S^\infty$, which is given by


$$\forall c \in S^\infty \quad \forall k \ge 0 \qquad \Phi_\infty(c)(k) = \frac{\displaystyle \sum_{0 \le i \le k} c(i) A_H(i) e^{-a} \frac{a^{k-i}}{(k-i)!}}{(\sigma - 1) c(0) + 1}.$$


This expression for the map $\Phi_\infty$ was computed in section 10.1, when studying the
limit of the Moran–Kingman model.
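
As an illustration, here is a minimal Python sketch that iterates the limiting map on the first K coordinates, which is possible because coordinate k of the image depends only on coordinates 0 to k. It assumes the sharp peak landscape, A_H(0) = σ and A_H(i) = 1 for i ≥ 1, which is what the denominator (σ − 1)c(0) + 1 suggests. The function names and the numerical values of σ and a are hypothetical. Solving the fixed point equation for coordinate 0 gives c(0) = (σe^{−a} − 1)/(σ − 1) when σe^{−a} > 1, and the iteration should approach this value whenever the initial c(0) is positive.

```python
import math


def phi_infinity(c, sigma, a):
    """One application of the limiting map to the first len(c) coordinates.

    Assumption of this sketch: sharp peak landscape, fitness sigma for
    class 0 and fitness 1 for every other Hamming class.
    """
    size = len(c)
    fitness = [sigma] + [1.0] * (size - 1)
    normalisation = (sigma - 1.0) * c[0] + 1.0
    image = []
    for k in range(size):
        total = 0.0
        for i in range(k + 1):
            # Poisson(a) weight of k - i mutations away from class i
            poisson_weight = math.exp(-a) * a ** (k - i) / math.factorial(k - i)
            total += c[i] * fitness[i] * poisson_weight
        image.append(total / normalisation)
    return image


def iterate_phi(c, sigma, a, steps):
    """Apply phi_infinity repeatedly, `steps` times."""
    for _ in range(steps):
        c = phi_infinity(c, sigma, a)
    return c


def master_fixed_point(sigma, a):
    """Non-zero fixed point of the coordinate-0 recursion (needs sigma*exp(-a) > 1)."""
    return (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)
```

Starting from a sequence with c(0) = 0, the same iteration keeps c(0) = 0 and pushes the mass of the first coordinates towards higher classes, which matches the role of the set S_0 in the propositions of this chapter.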


Proposition 15.4 In the long chain regime, any weak limit of a subsequence of the
family $\nu_{m,\ell,q}$ is invariant under the map $\Phi_\infty$.


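Proof Let $K \ge 1$ and let $f : S^K \to \mathbb{R}$ be a continuous function. We shall proceed
as in the proof of theorem 3.5 to control the difference


$$\int_{S^\infty} f(\pi_K(c)) \, d\nu_{m,\ell,q}(c) - \int_{S^\infty} f\big(\pi_K(\Phi(c))\big) \, d\nu_{m,\ell,q}(c).$$

We rewrite this difference as


$$\sum_{c \in S_m} \nu_{m,\ell,q}(c) \sum_{d \in S_m} \Big( f(\pi_K(d)) - f\big(\pi_K(\Phi(c))\big) \Big) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big).$$


Let us fix $\varepsilon > 0$. The function $f$ is uniformly continuous on $S^K$, thus there exists
$\delta > 0$ such that


$$\forall c, d \in S^K \qquad |c - d| < \delta \;\Longrightarrow\; |f(c) - f(d)| < \varepsilon.$$


We decompose the sum over $d \in S_m$ as


$$\sum_{d \in S_m} \cdots \;=\; \sum_{d \in S_m,\ |\pi_K(d) - \pi_K(\Phi(c))| \ge \delta} \cdots \;+\; \sum_{d \in S_m,\ |\pi_K(d) - \pi_K(\Phi(c))| < \delta} \cdots$$


We bound each term separately to get


$$\Big| \sum_{d \in S_m} \Big( f(\pi_K(d)) - f\big(\pi_K(\Phi(c))\big) \Big) \, \mathrm{Prob}\big( \mathrm{Mult}(m, \Phi(c)) = m d \big) \Big| \le 2 \|f\|_\infty \, \mathrm{Prob}\Big( \big| \pi_K(\mathrm{Mult}(m, \Phi(c))) - m \, \pi_K(\Phi(c)) \big| \ge m\delta \Big) + \varepsilon.$$

This last probability depends only on the first $K$ components of the multinomial
law. With the help of Hoeffding's inequality A.4 (in the appendix), we obtain the
following bound: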


ms? 


Prob(|rx (Mult(m, ®(c))) — max(®(c))| > m6) < 2K exp ( ae 


Plugging these inequalities into the sum over c, we obtain 


| 1 foe / CONE MEE. 
Ss” Se 


2mé? 
< ll flloK exp(- 5) + 
In addition, we have the expansion

$$\forall i \in \{0, \dots, K\} \qquad \Phi(c)(i) = \Phi_\infty(c)(i) + O\Big(\frac{1}{\ell}\Big).$$

This expansion is uniform with respect to $c \in S^\infty$. Thus, for $\ell$ large enough, we have

$$\forall c \in S^\infty \qquad \big| \pi_K(\Phi(c)) - \pi_K(\Phi_\infty(c)) \big| < \delta,$$


whence


$$\Big| \int_{S^\infty} f\big(\pi_K(\Phi(c))\big) \, d\nu_{m,\ell,q}(c) - \int_{S^\infty} f\big(\pi_K(\Phi_\infty(c))\big) \, d\nu_{m,\ell,q}(c) \Big| < \varepsilon.$$


Let $\nu^*$ be a weak limit of a subsequence of the family $\nu_{m,\ell,q}$. Passing to the limit
in the above inequalities along the subsequence and sending $\varepsilon$ to 0, we see that $\nu^*$
satisfies


$$\int_{S^\infty} f(\pi_K(c)) \, d\nu^*(c) = \int_{S^\infty} f\big(\pi_K(\Phi_\infty(c))\big) \, d\nu^*(c).$$


This holds for any $K \ge 1$ and any continuous function $f$ defined on $S^K$, therefore
$\nu^*$ is invariant under $\Phi_\infty$. □
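
The concentration step of this proof can be checked numerically one coordinate at a time. The Python sketch below uses the standard two-sided Hoeffding bound P(|B − mp| ≥ mδ) ≤ 2 exp(−2mδ²) for a binomial variable B with parameters (m, p), together with a union bound over K coordinates of a multinomial vector measured in the sup norm. Both the choice of norm and the resulting constant 2K exp(−2mδ²) are assumptions of this sketch and need not coincide with the constants of inequality A.4. The probability vector in the usage note is made up.

```python
import math


def binomial_two_sided_tail(m, p, delta):
    """Exact P(|B - m*p| >= m*delta) for B ~ Binomial(m, p)."""
    total = 0.0
    for j in range(m + 1):
        if abs(j - m * p) >= m * delta:
            total += math.comb(m, j) * p ** j * (1 - p) ** (m - j)
    return total


def union_tail(m, probs, delta):
    """Union bound: sum of the exact marginal tails of the listed coordinates.

    Each coordinate of a Multinomial(m, .) vector is binomial, so this sum
    dominates the probability that some listed coordinate deviates by m*delta.
    """
    return sum(binomial_two_sided_tail(m, p, delta) for p in probs)


def hoeffding_union_bound(m, K, delta):
    """Hoeffding bound 2*K*exp(-2*m*delta^2) for K coordinates (sup norm)."""
    return 2 * K * math.exp(-2 * m * delta ** 2)
```

For instance, `union_tail(200, [0.5, 0.3, 0.1], 0.1)` stays below `hoeffding_union_bound(200, 3, 0.1)`, and the gap widens exponentially as m grows, which is what makes the first term of the bound disappear in the limit.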


We denote by $\delta_0$ the Dirac mass on the null sequence and by $\delta_{Q(\sigma,a)}$ the Dirac mass
on the quasispecies distribution $Q(\sigma, a)$.


Proposition 15.5 The only probability measures on $S^\infty$ which are invariant under
the map $\Phi_\infty$ are the convex combinations of $\delta_0$ and $\delta_{Q(\sigma,a)}$.


Proof Let $\nu$ be a probability measure on $S^\infty$ which is invariant under the map $\Phi_\infty$.
Let $K \ge 1$ and let $f : S^K \to \mathbb{R}$ be a continuous function. We have


$$\int_{S^\infty} f(\pi_K(c)) \, d\nu(c) = \int_{S^\infty} f\big(\pi_K(\Phi_\infty(c))\big) \, d\nu(c).$$


Iterating the previous identity, we obtain


$$\forall n \ge 1 \qquad \int_{S^\infty} f(\pi_K(c)) \, d\nu(c) = \int_{S^\infty} f\big(\pi_K((\Phi_\infty)^n(c))\big) \, d\nu(c).$$


The convergence of the sequence $((\Phi_\infty)^n(c))_{n \in \mathbb{N}}$, for any $c \in S^\infty$, is analyzed in
theorem 10.1. Let us introduce the set


$$S_0 = \{ c \in S^\infty : c(0) = 0 \}.$$


Letting $n$ go to $\infty$ and applying the dominated convergence theorem to the right
integral, we obtain that


$$\int_{S^\infty} f(\pi_K(c)) \, d\nu(c) = \nu(S_0) \, f(\pi_K(0)) + \nu(S^\infty \setminus S_0) \, f\big(\pi_K(Q(\sigma, a))\big).$$

This identity holds for any $K \ge 1$ and any continuous function $f : S^K \to \mathbb{R}$. We
conclude that

$$\nu = \nu(S_0) \, \delta_0 + \nu(S^\infty \setminus S_0) \, \delta_{Q(\sigma,a)},$$


hence $\nu$ is a convex combination of $\delta_0$ and $\delta_{Q(\sigma,a)}$. □
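
This proposition is particularly relevant: it implies that, in order to prove theorem 11.2,
it is enough to study the number of master sequences in the population. Indeed, it
suffices to prove that, for any continuous function $f$ from $\mathbb{R}$ to $\mathbb{R}$,


$$\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \int_{S^\infty} f(c(0)) \, d\nu_{m,\ell,q}(c) = \begin{cases} f(0) & \text{if } \alpha\psi(a) < \ln\kappa, \\ f\big(Q(\sigma, a)(0)\big) & \text{if } \alpha\psi(a) > \ln\kappa. \end{cases}$$


However, the proof of this statement is quite long and requires several steps. This
is the object of part IV.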


Part IV 
Proof for Wright—Fisher 


Overview of Part IV 


This part is devoted to the full proof of theorem 11.2 for the Wright—Fisher model. 
We present the general strategy of the proof in chapter 16. The proof involves some 
general results on the invariant probability measure of a Markov chain, which we 
recall as well. The proof is quite long and technical, because it requires several 
different probabilistic estimates, which are themselves quite delicate. Schematically, 
on the sharp peak landscape, there are two attractors: the quasispecies and the 
disordered state. The Wright–Fisher process oscillates between these two attractors
and the game consists in understanding which of them is most stable asymptotically,
depending on the parameters a and α. We have structured the proof by deriving
separately the relevant estimates for each attractor. We need to control the following
three elements:

• The amount of time spent far from the attractor;

• The probability of reaching a small neighborhood of the attractor;

• The probability of exiting the basin of attraction of the attractor.

The various estimates for both attractors must then be used simultaneously in the 
formula for the invariant measure, or rather a general inequality deriving from it, 
in order to conclude that one of the attractors is asymptotically less stable than the 
other. Chapter 17 deals with the non-neutral phase, which plays the role of the basin
of attraction for the quasispecies. Chapter 18 deals with the mutation dynamics of
one single individual; it is a preliminary to understanding the collective behavior of
the population in a selectively neutral fitness landscape. Chapter 19 deals with the
neutral phase, which plays the role of the basin of attraction for the disordered state.
At first sight, the treatments of the quasispecies and the disordered state look similar,
because they rely on ideas coming from the Freidlin–Wentzell theory of random
perturbations of dynamical systems. However, the very nature of these two attractors
is fundamentally different. The quasispecies attractor is created by a stable fixed point 
of a discrete dynamical system, namely the Moran—Kingman model. The disordered 
attractor originates from the classical law of large numbers. The final synthesis is 
accomplished in chapter 20. 


Chapter 16
Strategy of the Proof


The aim of part IV is to prove theorem 11.2. The heuristics that we have given for
the Moran model in the previous chapter do not work in the case of the Wright–Fisher
model, the reason being that the Wright–Fisher model replaces the whole population
at each time step, thus making the birth and death approximation impossible. It is
still true, though, that the dynamics of the concentration process $(C_n)_{n \in \mathbb{N}}$ associated
to the Wright–Fisher process $(X_n)_{n \in \mathbb{N}}$ (defined in section 11.2) is very different
according to whether the master sequence is present or not in the population. In this
chapter we present the strategy that we follow in order to prove theorem 11.2.


16.1 Main Ideas 


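In view of proposition 15.5, it is enough to show that, for any continuous function $f$
from $[0, 1]$ to $\mathbb{R}$,


$$\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \int_{S^\infty} f(c(0)) \, d\nu_{m,\ell,q}(c) = \begin{cases} f(0) & \text{if } \alpha\psi(a) < \ln\kappa, \\ f\big(Q(\sigma, a)(0)\big) & \text{if } \alpha\psi(a) > \ln\kappa. \end{cases}$$


We recall that $\nu_{m,\ell,q}$ is the invariant probability measure of the concentration process
$(C_n)_{n \in \mathbb{N}}$. From the ergodic theorem 16.1, we know that, with probability one,


$$\int_{S^\infty} f(c(0)) \, d\nu_{m,\ell,q}(c) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(C_k(0)).$$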

A possible strategy to prove the result consists in introducing a sequence of stopping
times as follows. Let us denote by $\mathcal{N}$ the populations which do not contain any
master sequence and by $\mathcal{M}$ those which contain at least one master sequence. We
call the set $\mathcal{N}$ the neutral set and $\mathcal{M}$ the non-neutral set. We define iteratively two
sequences $(\tau^*_k)_{k \ge 1}$, $(\tau_k)_{k \ge 0}$ of stopping times by setting $\tau_0 = 0$ and

$$\begin{aligned}
\tau^*_1 &= \inf \{ n \ge 0 : C_n(0) > 0 \}, & \tau_1 &= \inf \{ n \ge \tau^*_1 : C_n(0) = 0 \}, \\
&\ \ \vdots & &\ \ \vdots \\
\tau^*_k &= \inf \{ n \ge \tau_{k-1} : C_n(0) > 0 \}, & \tau_k &= \inf \{ n \ge \tau^*_k : C_n(0) = 0 \}, \\
&\ \ \vdots & &\ \ \vdots
\end{aligned}$$


We would then study the successive visits to the sets $\mathcal{N}$ and $\mathcal{M}$, i.e., the excursions

$$(C_n, \ \tau_k \le n \le \tau^*_{k+1}), \qquad (C_n, \ \tau^*_k \le n \le \tau_k), \qquad k \ge 1,$$


in order to compute estimates on the laws of the sequences of time intervals


$$\tau^*_{k+1} - \tau_k, \qquad \tau_k - \tau^*_k, \qquad k \ge 1.$$


The point is that, depending on the values of the parameters a and α, asymptotically,
the amount of time spent by the process in one of the sets $\mathcal{N}$ or $\mathcal{M}$ becomes
negligible compared to the other. This gives rise to the phase transition described by
theorem 11.2. The main difficulty is that the above sequences are neither identically
distributed nor independent. In [16, 20], the Wright–Fisher model was coupled with
simpler processes, in which the entrance point to the set $\mathcal{N}$ was forced to be unique.
This made it possible to use a renewal argument, comparable to the one presented
in section 13.2 for the Moran model, and to perform the required computations on
the bounding processes.
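
To make the alternating stopping times concrete, here is a small Python sketch that extracts them from a finite record of the master sequence fraction C_n(0). It assumes the reading τ_0 = 0, τ*_k = inf{n ≥ τ_{k−1} : C_n(0) > 0} and τ_k = inf{n ≥ τ*_k : C_n(0) = 0}. The function name and the sample path in the usage note are hypothetical, and the search simply stops when the recorded trajectory ends.

```python
def excursion_times(master_fraction):
    """Return (tau_star, tau) for a finite path of C_n(0).

    tau[0] = 0 by convention; tau_star[k-1] is the k-th entrance time into
    the non-neutral set (C_n(0) > 0) and tau[k] is the following return
    time to the neutral set (C_n(0) == 0).
    """
    horizon = len(master_fraction)
    tau = [0]
    tau_star = []
    while True:
        # next time at or after tau_{k-1} where a master sequence is present
        entry = next(
            (n for n in range(tau[-1], horizon) if master_fraction[n] > 0), None
        )
        if entry is None:
            break
        tau_star.append(entry)
        # next time at or after tau*_k where the master sequence is lost
        loss = next(
            (n for n in range(entry, horizon) if master_fraction[n] == 0), None
        )
        if loss is None:
            break
        tau.append(loss)
    return tau_star, tau
```

On the path `[0, 0, 0.1, 0.2, 0, 0, 0.3, 0, 0]` this returns `([2, 6], [0, 4, 7])`: the differences between consecutive entries of the two lists are the sojourn lengths in the neutral and non-neutral sets, whose laws the renewal argument would have to control.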

To avoid the tedious construction of monotone couplings, another possibility
would be to develop estimates on the entrance and exit times of the sets $\mathcal{N}$ and $\mathcal{M}$,
which would somehow be uniform with respect to the starting population, provided
this starting population belongs to an adequate subset of $\mathcal{N}$ or $\mathcal{M}$. This is the approach
we take here. In addition, we will bypass the use of the ergodic theorem with the
help of a beautiful representation formula for the invariant probability measure.
This formula, presented in theorem 16.3, yields bounds on the invariant measure as
follows. For $A$ a non-empty subset of the state space, we define the hitting time $\tau_A$
of $A$ as

$$\tau_A = \min \{ n \ge 1 : C_n \in A \}.$$


Now, in lemma 16.4, it is shown that, for any subsets $V$, $G$, we have


$$\nu_{m,\ell,q}(G) \le \sup_{c \in V} P(\tau_G < \tau_V \mid C_0 = c) \times \sup_{d \in G} E(\tau_V \mid C_0 = d). \tag{16.1}$$

We shall apply this inequality with judicious choices for the sets $V$ and $G$, with
the goal of proving that the invariant measure of one of the sets $\mathcal{N}$ or $\mathcal{M}$ vanishes
asymptotically, depending on the values of a and α. In fact, the idea is to choose a
set $V$ which attracts all the probability mass, like a sink. If one manages to prove
that, uniformly over the starting point in $V$, it is extremely unlikely to enter $G$ before
paying another visit to $V$, and that, uniformly over a starting point in $G$, the expected
hitting time of $V$ is bounded from above, with quantitative estimates such that the
product on the right-hand side of formula (16.1) vanishes, then we are done. Now
there is a further problem. We cannot simply apply formula (16.1) with $\mathcal{N}$ and
$\mathcal{M}$, because, if we start from the boundary of one of these sets, the probability of
jumping to the other set is far too large compared to the corresponding expectation,
and nothing can be concluded. We will instead work with choices of the set $V$ which
correspond to neighborhoods of the attractors of the process, that is, subsets of $\mathcal{N}$ and
$\mathcal{M}$ which describe the typical populations, where the process spends most of its time.
When the process starts from these neighborhoods of the attractors, the probability
of jumping directly to the basin of attraction of the other attractor is exponentially
small in one of the parameters $m$ or $\ell$, while the corresponding expected value for
the hitting time is exponentially large in the other parameter. Here are our choices
for the sets $V$ and $G$:

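• To prove the existence of the quasispecies regime, we introduce the following
subsets, for $\varepsilon > 0$:


$$\mathcal{N}_\varepsilon = \{ c \in S^\infty : c(0) \le \varepsilon \}, \qquad \mathcal{M}_\varepsilon = \{ c \in S^\infty : c(0) \ge Q(0) - \varepsilon \}.$$

We apply inequality (16.1) with the sets $G = \mathcal{N}_\varepsilon$, $V = \mathcal{M}_\varepsilon$:


$$\nu_{m,\ell,q}(\mathcal{N}_\varepsilon) \le \sup_{d \in \mathcal{M}_\varepsilon} P(\tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \mid C_0 = d) \times \sup_{d \in \mathcal{N}_\varepsilon} E(\tau_{\mathcal{M}_\varepsilon} \mid C_0 = d).$$


• To prove the existence of the disordered regime, we introduce the following subsets,
for $\varepsilon > 0$: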


N* = {ce S™: c(i) =0,0<i < &(1-1/x)- Veine}, 
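$$\mathcal{M} = \{ c \in S^\infty : c(0) > 0 \}.$$


We apply inequality (16.1) with the sets $G = \mathcal{M}$, $V = \mathcal{N}^*$:


$$\nu_{m,\ell,q}(\mathcal{M}) \le \sup_{d \in \mathcal{N}^*} P(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \mid C_0 = d) \times \sup_{d \in \mathcal{M}} E(\tau_{\mathcal{N}^*} \mid C_0 = d).$$
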
Next we must obtain the relevant estimates for the quantities appearing in these
inequalities. The strategies are quite similar in both cases. In each case, the first
terms are


$$\sup_{d \in \mathcal{M}_\varepsilon} P(\tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \mid C_0 = d), \qquad \sup_{d \in \mathcal{N}^*} P(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \mid C_0 = d),$$


which correspond to the probability that one hitting time is smaller than the other.
More precisely, we have to estimate the probability that, starting in the neighborhood
of one attractor, the process ends up in the basin of attraction of the other attractor
before visiting again the attractor close to its starting point. This is a classical
paradigm of the Freidlin–Wentzell theory. We first develop an estimate proving
that the process is extremely unlikely to stay outside of the neighborhoods of the
attractors for a long time. Then we are left with trajectories which must accomplish
the transition in a short time, and probabilities of the form


$$\sup_{d \in \mathcal{M}_\varepsilon} P(C_1, \dots, C_{N-1} \notin \mathcal{N}_\varepsilon \cup \mathcal{M}_\varepsilon, \ C_N \in \mathcal{N}_\varepsilon \mid C_0 = d), \tag{16.2}$$

$$\sup_{d \in \mathcal{N}^*} P(C_1, \dots, C_{N-1} \notin \mathcal{N}^* \cup \mathcal{M}, \ C_N \in \mathcal{M} \mid C_0 = d), \tag{16.3}$$

where $N$ is a fixed integer, and this allows us to carry out the estimates. As for the
second terms,


$$\sup_{d \in \mathcal{N}_\varepsilon} E(\tau_{\mathcal{M}_\varepsilon} \mid C_0 = d), \qquad \sup_{d \in \mathcal{M}} E(\tau_{\mathcal{N}^*} \mid C_0 = d),$$


we must obtain an upper bound on the expected hitting time of an attractor, when
we start in the basin of attraction of the other attractor. The main contribution to the
expectation comes from the trajectories which first visit the attractor corresponding
to the basin of their starting point, stay there for a very long time, and then later on
jump to the other basin of attraction. A simple technique to bound from above the
expectation of these hitting times is to compute a lower bound on the probability of
jumping in a short time from one attractor to the other, which is uniform over the
basin of attraction, of the form:


$$\forall d \in \mathcal{N}_\varepsilon \qquad P(\tau_{\mathcal{M}_\varepsilon} \le N \mid C_0 = d) \ge \beta, \tag{16.4}$$

$$\forall d \in \mathcal{M} \qquad P(\tau_{\mathcal{N}^*} \le N \mid C_0 = d) \ge \beta, \tag{16.5}$$


where $N$ is a fixed integer. Because the lower bound is uniform, we can then decompose
a long interval into short intervals of length $N$, where the jump could happen;
with each of these intervals, we associate an independent coin of parameter $\beta$ to
decide whether the jump occurs, and this way we can compare the hitting time with a
simple geometric law. Therefore, for both terms, the probability and the expectation,
we have to study the trajectories which perform in a short time the transition from the
neighborhood of one attractor to the basin of the other attractor. Notice that all the
estimates we have been seeking so far concern trajectories over a fixed time interval.
Once we have all these estimates in hand, we can plug them into the representation
formula to obtain estimates on the invariant probability measure. This is the magic
of the Freidlin–Wentzell theory: from estimates on trajectories occurring during a
short time interval, we derive estimates on the long-time behavior of the process. A
noteworthy point is that we must study both the neutral and the non-neutral phases in
order to derive the lower bounds (16.4) and (16.5). Indeed, the relevant trajectories
typically start close to one attractor, perform the exit from their basin of attraction,
and enter the basin of attraction of the other attractor, which they finally reach. To
avoid jumping back and forth between the two phases while computing the estimates,
we choose to study each phase separately. First we derive all the required estimates
for each attractor, and we combine them afterwards to get the desired lower bounds.

Although the global strategy is the same for the quasispecies and the disordered
regimes, the nature of the attractors in each case is very different. In the quasispecies
regime, the attractor originates from a stable fixed point of the quasispecies equation.
In the disordered regime, the attractor originates from the classical law of large
numbers. This is why it is more convenient to study separately the dynamics of the
process $(C_n)_{n \in \mathbb{N}}$ when the master sequence is present in the population, and when it
is not.

• When the master sequence is present in the population, and until it is lost, the
dynamics of the population is not neutral. We call this the non-neutral phase. The
study of the non-neutral phase will rely on a large deviations principle for the
transition probabilities of the number of master sequences, which shows that the number
of master sequences can be seen as a random perturbation of a discrete dynamical
system. The non-neutral phase contains one stable attractor, corresponding to the
quasispecies distribution.

• When the master sequence is not present in the population, all the individuals in
the population have fitness 1, and until the discovery of the master sequence, the
dynamics of $(C_n)_{n \in \mathbb{N}}$ is that of a neutrally evolving population. We call this the
neutral phase. The study of the neutral phase will consist in focusing first on a single
individual, who explores the state space by moving according to the mutation matrix
$M_H$. Afterwards, we shall relate the dynamics of the whole population to that of a
single mutant. The neutral phase contains one stable attractor, corresponding to the
Hamming class at distance $\ell(1 - 1/\kappa)$ from the master sequence.
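
The value ℓ(1 − 1/κ) can be recovered from a one-site computation. The Python sketch below assumes a mutation mechanism that this section does not spell out: each site mutates independently with probability q per generation, to a letter chosen uniformly among the κ − 1 other letters. Under that assumption, the probability d_n that a given site differs from the master sequence after n generations follows a linear recurrence whose fixed point is 1 − 1/κ, so that the expected Hamming distance of a neutrally evolving individual settles near ℓ(1 − 1/κ). Function names and parameter values are illustrative only.

```python
def mismatch_probability(q, kappa, generations, start=0.0):
    """Probability that one site differs from the master letter.

    Assumed dynamics: a mismatched site mutates back to the master letter
    with probability q/(kappa-1); a matching site is lost with probability q.
    """
    d = start
    for _ in range(generations):
        d = d * (1.0 - q / (kappa - 1)) + (1.0 - d) * q
    return d


def expected_hamming_distance(length, q, kappa, generations, start=0.0):
    """Expected distance to the master sequence for a chain of `length` sites."""
    return length * mismatch_probability(q, kappa, generations, start)
```

With q = 0.01 and κ = 4 the recurrence contracts towards 0.75 at rate 1 − qκ/(κ − 1) per generation, whatever the starting value; the law of large numbers over the ℓ sites is what turns this mean into an attractor for the whole population.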

The next section 16.2 is devoted to the exposition of some results on the invariant
measure of a general discrete Markov chain. In chapter 17, we deal with the
non-neutral phase, while the neutral phase is examined in chapters 18 and 19. In
chapter 20, we combine the results obtained for the neutral and non-neutral phases in
order to complete the proof of theorem 11.2. The overall structure of the argument is
a bit intricate, because the result on the quasispecies regime relies on the estimates
for the probabilities (16.3) and (16.5), while the result on the disordered regime
relies on the estimates for the probabilities (16.2) and (16.4). In order to obtain the
relevant estimates for all these probabilities, we have to consider trajectories with
two distinct portions: a first portion escaping from the initial attractor and a second
portion reaching the final attractor. Thus each portion of the trajectory occurs in a
different phase of the population space. To sum up, in each phase we shall essentially
prove two estimates, one on the probability of reaching the equilibrium inside the
phase, and one on the probability of escaping from the phase. These four estimates
will be combined adequately to prove the two parts of theorem 11.2.


16.2 Invariant Probability Measure 


We consider a discrete time Markov chain $(X_t)_{t \ge 0}$ with values in a finite state space
$\mathcal{E}$, and with transition matrix $(p(x, y))_{x, y \in \mathcal{E}}$. If the Markov chain is irreducible and
aperiodic, then it admits a unique invariant probability measure $\mu$, i.e., the set of
equations


$$\mu(y) = \sum_{x \in \mathcal{E}} \mu(x) \, p(x, y), \qquad y \in \mathcal{E},$$


admits a unique solution which is a probability measure. We state next the ergodic
theorem for Markov chains. We consider only the case where the state space $\mathcal{E}$ is finite.


Theorem 16.1 Suppose that the Markov chain $(X_t)_{t \ge 0}$ is irreducible and aperiodic.
Let $\mu$ be its invariant probability measure. For any initial distribution $\mu_0$, for any
function $f : \mathcal{E} \to \mathbb{R}$, we have, with probability one,

$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=0}^{t-1} f(X_s) = \int_{\mathcal{E}} f(x) \, d\mu(x).$$


The Markov chain $(X_t)_{t \ge 0}$ is said to be reversible with respect to a probability
measure $\nu$ if it satisfies the detailed balance conditions:


$$\forall x, y \in \mathcal{E} \qquad \nu(x) \, p(x, y) = \nu(y) \, p(y, x).$$


If the Markov chain $(X_t)_{t \ge 0}$ is reversible with respect to a probability measure $\nu$, then
$\nu$ is an invariant probability measure for $(X_t)_{t \ge 0}$. If $(X_t)_{t \ge 0}$ is in addition irreducible
and aperiodic, then $\nu$ is the unique invariant probability measure of the chain.


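Lemma 16.2 Suppose that $\mu$ is an invariant probability measure for the Markov
chain $(X_t)_{t \ge 0}$. We have then


$$\forall x, y \in \mathcal{E} \quad \forall t \ge 0 \qquad \mu(x) \, P(X_t = y \mid X_0 = x) \le \mu(y).$$

Proof The proof is done by induction on $t$. For $t = 0$, we have

$$P(X_0 = y \mid X_0 = x) = 0 \quad \text{if } y \ne x,$$

and the result holds. Suppose it has been proved until time $t \in \mathbb{N}$. We have then, for
$x, y \in \mathcal{E}$,


$$\begin{aligned}
\mu(x) \, P(X_{t+1} = y \mid X_0 = x) &= \sum_{z \in \mathcal{E}} \mu(x) \, P(X_{t+1} = y, \ X_t = z \mid X_0 = x) \\
&= \sum_{z \in \mathcal{E}} \mu(x) \, P(X_t = z \mid X_0 = x) \, P(X_{t+1} = y \mid X_t = z) \\
&\le \sum_{z \in \mathcal{E}} \mu(z) \, p(z, y) = \mu(y)
\end{aligned}$$


and the claim is proved at time $t + 1$. □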
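
The inequality of lemma 16.2 is easy to test on a small example. The following Python sketch uses a made-up three-state birth and death chain whose invariant measure (1/4, 1/2, 1/4) can be read off from detailed balance, and checks μ(x) P(X_t = y | X_0 = x) ≤ μ(y) with exact rational arithmetic for the first few values of t. The chain, the names `P_EXAMPLE` and `MU_EXAMPLE`, and the helper functions are illustrative choices, not objects from the text.

```python
from fractions import Fraction as F

# Hypothetical 3-state chain, reversible with respect to (1/4, 1/2, 1/4).
P_EXAMPLE = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]
MU_EXAMPLE = [F(1, 4), F(1, 2), F(1, 4)]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_invariant(mu, P):
    """Check mu(y) = sum_x mu(x) p(x, y) for every state y."""
    n = len(P)
    return all(sum(mu[x] * P[x][y] for x in range(n)) == mu[y] for y in range(n))


def lemma_16_2_holds(mu, P, horizon):
    """Check mu(x) * P^t(x, y) <= mu(y) for all x, y and 0 <= t <= horizon."""
    n = len(P)
    power = [[F(int(i == j)) for j in range(n)] for i in range(n)]  # P^0
    for _ in range(horizon + 1):
        if any(mu[x] * power[x][y] > mu[y] for x in range(n) for y in range(n)):
            return False
        power = mat_mul(power, P)
    return True
```

Exact fractions avoid any rounding question when the two sides of the inequality coincide, which happens at t = 0 for x = y.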


We give next the representation formula for the invariant probability measure. This
formula can be found in the book of Freidlin and Wentzell (see chapter 6, section 4
of [38]), where it is attributed to Khas'minskii, and in the book of Kifer [54, 55].


Theorem 16.3 (Representation formula) Let us suppose that the Markov chain
$(X_t)_{t \ge 0}$ is irreducible and aperiodic. Let $\mu$ be the invariant probability measure of
$(X_t)_{t \ge 0}$. Let $V$ be a non-empty subset of $\mathcal{E}$. We define the hitting time $\tau_V$ of $V$ as


$$\tau_V = \min \{ n \ge 1 : X_n \in V \}.$$

We have then, for any subset $G$ of $\mathcal{E}$,


$$\mu(G) = \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_G(X_k) \,\Big|\, X_0 = x \Big).$$


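Proof With standard computations involving the Markov property, we check that
the formula above satisfies the equation of the invariant measure. The state space $\mathcal{E}$
is finite and the measure $\mu$ is additive, therefore we need only to check the validity
of the formula for singletons. For $G = \{ y \}$, the formula becomes


$$\mu(y) = \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_{\{X_k = y\}} \,\Big|\, X_0 = x \Big). \tag{16.6}$$


In order to check that the invariant probability measure $\mu$ satisfies this equation, we
define a measure $\nu$ on $\mathcal{E}$ by setting


$$\forall y \in \mathcal{E} \qquad \nu(y) = \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_{\{X_k = y\}} \,\Big|\, X_0 = x \Big) \tag{16.7}$$


and we check that $\nu$ is an invariant probability measure for the Markov chain $(X_t)_{t \ge 0}$.
Since the invariant probability measure is unique, this will imply that $\nu = \mu$, and that
$\mu$ satisfies the formula (16.6). So, let us start the computations with $\nu$. Let $z \in \mathcal{E}$.
We have


$$\begin{aligned}
\sum_{y \in \mathcal{E}} \nu(y) \, p(y, z) &= \sum_{y \in \mathcal{E}} \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_{\{X_k = y\}} \,\Big|\, X_0 = x \Big) \, p(y, z) \\
&= \sum_{y \in \mathcal{E}} \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_{\{X_k = y, \ X_{k+1} = z\}} \,\Big|\, X_0 = x \Big) \\
&= \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=0}^{\tau_V - 1} 1_{\{X_{k+1} = z\}} \,\Big|\, X_0 = x \Big) \\
&= \sum_{x \in V} \mu(x) \, E\Big( \sum_{k=1}^{\tau_V} 1_{\{X_k = z\}} \,\Big|\, X_0 = x \Big).
\end{aligned}$$


If $z$ does not belong to $V$, then this formula coincides with formula (16.7), and we
conclude that


$$\sum_{y \in \mathcal{E}} \nu(y) \, p(y, z) = \nu(z). \tag{16.8}$$


For $z$ in $V$, the previous computation yields


$$\sum_{y \in \mathcal{E}} \nu(y) \, p(y, z) = \sum_{x \in V} \mu(x) \, P(X_{\tau_V} = z \mid X_0 = x). \tag{16.9}$$

We evaluate next the sum on the right-hand side. To do so, we consider the sum over
the whole space $\mathcal{E}$ instead of $V$. Using the Markov property and the fact that $\mu$ is an
invariant probability measure, we have


$$\begin{aligned}
\sum_{x \in \mathcal{E}} \mu(x) \, P(X_{\tau_V} = z \mid X_0 = x)
&= \sum_{x \in \mathcal{E}} \mu(x) \, P(X_1 = z \mid X_0 = x) + \sum_{x \in \mathcal{E}} \mu(x) \, P(\tau_V > 1, \ X_{\tau_V} = z \mid X_0 = x) \\
&= \mu(z) + \sum_{x \in \mathcal{E}} \sum_{u \notin V} \mu(x) \, P(X_1 = u, \ \tau_V > 1, \ X_{\tau_V} = z \mid X_0 = x) \\
&= \mu(z) + \sum_{x \in \mathcal{E}} \sum_{u \notin V} \mu(x) \, P(X_1 = u \mid X_0 = x) \, P(X_{\tau_V} = z \mid X_0 = x, \ X_1 = u) \\
&= \mu(z) + \sum_{x \in \mathcal{E}} \sum_{u \notin V} \mu(x) \, p(x, u) \, P(X_{\tau_V} = z \mid X_0 = u) \\
&= \mu(z) + \sum_{u \notin V} \mu(u) \, P(X_{\tau_V} = z \mid X_0 = u).
\end{aligned}$$


Subtracting the sum over $u \notin V$, we deduce from this equation that

$$\sum_{x \in V} \mu(x) \, P(X_{\tau_V} = z \mid X_0 = x) = \mu(z).$$


Substituting this identity into formula (16.9) and noting that $\mu(z) = \nu(z)$ for $z \in V$,
we see that equation (16.8) holds also for $z \in V$, therefore $\nu$ is an invariant measure
for the Markov chain $(X_t)_{t \ge 0}$. It remains to check that $\nu$ is a probability measure. To
this end, we compute the sum


$$\begin{aligned}
\sum_{x \in \mathcal{E}} \mu(x) \, E(\tau_V \mid X_0 = x)
&= \sum_{x \in \mathcal{E}} \sum_{y \in V} \mu(x) \, p(x, y) + \sum_{x \in \mathcal{E}} \sum_{y \notin V} \mu(x) \, E\big(\tau_V 1_{\{X_1 = y\}} \mid X_0 = x\big) \\
&= \sum_{x \in \mathcal{E}} \sum_{y \in V} \mu(x) \, p(x, y) + \sum_{x \in \mathcal{E}} \sum_{y \notin V} \mu(x) \, p(x, y) \big(1 + E(\tau_V \mid X_0 = y)\big) \\
&= 1 + \sum_{y \notin V} \mu(y) \, E(\tau_V \mid X_0 = y).
\end{aligned}$$


Subtracting the sum appearing in the right-hand side, we obtain

$$\sum_{x \in V} \mu(x) \, E(\tau_V \mid X_0 = x) = 1.$$


Yet, from formula (16.7), we see that this quantity is precisely equal to $\nu(\mathcal{E})$, so $\nu$ is
indeed a probability measure. □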
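
The representation formula can be verified exactly on a toy chain. The Python sketch below is a minimal check under stated assumptions: the three-state chain and its invariant measure (1/4, 1/2, 1/4) are made up, and the expected occupation time of G before the return to V is obtained by solving the linear system h(y) = 1_G(y) + Σ_{z∉V} p(y, z) h(z) for y outside V with a small Gauss-Jordan routine over rational numbers. All function names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical 3-state chain with invariant measure (1/4, 1/2, 1/4).
P_EXAMPLE = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]
MU_EXAMPLE = [F(1, 4), F(1, 2), F(1, 4)]


def solve(A, b):
    """Gauss-Jordan elimination over exact fractions (A assumed invertible)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def occupation_before_return(P, V, G):
    """For x in V: E(sum_{k=0}^{tau_V - 1} 1_G(X_k) | X_0 = x)."""
    n = len(P)
    outside = [y for y in range(n) if y not in V]
    # h[y]: expected number of visits to G before entering V, started at y outside V
    A = [[F(int(y == z)) - P[y][z] for z in outside] for y in outside]
    b = [F(int(y in G)) for y in outside]
    h = dict(zip(outside, solve(A, b))) if outside else {}
    return {x: F(int(x in G)) + sum(P[x][z] * h[z] for z in outside) for x in V}


def representation_rhs(P, mu, V, G):
    """Right-hand side of the representation formula for mu(G)."""
    occupation = occupation_before_return(P, V, G)
    return sum(mu[x] * occupation[x] for x in V)
```

For example, `representation_rhs(P_EXAMPLE, MU_EXAMPLE, {0}, {2})` returns 1/4, which is the mass that the invariant measure gives to state 2.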


16.3 Upper Bounds


Next we use the representation formula for the invariant measure to derive an upper 
bound on it. 


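Lemma 16.4 For any subsets $V$, $G$ of $\mathcal{E}$, we have


$$\mu(G) \le \sup_{x \in V} P(\tau_G < \tau_V \mid X_0 = x) \times \sup_{y \in G} E(\tau_V \mid X_0 = y).$$


Proof From the formula in theorem 16.3, we deduce that

$$\mu(G) \le \sup_{x \in V} E\Big( \sum_{k=0}^{\tau_V - 1} 1_G(X_k) \,\Big|\, X_0 = x \Big).$$


Let us try to bound the last expectation. We denote by $E_x$ the expectation for the
Markov chain starting from $x$. We have


$$\begin{aligned}
E_x\Big( \sum_{k=0}^{\tau_V - 1} 1_G(X_k) \Big) &= \sum_{y \in G} E_x\Big( 1_{\{\tau_G < \tau_V\}} 1_{\{X_{\tau_G} = y\}} \sum_{k=\tau_G}^{\tau_V - 1} 1_G(X_k) \Big) \\
&= \sum_{y \in G} E_y\Big( \sum_{k=0}^{\tau_V - 1} 1_G(X_k) \Big) \, P_x(\tau_G < \tau_V, \ X_{\tau_G} = y) \\
&\le \sum_{y \in G} E_y(\tau_V) \, P_x(\tau_G < \tau_V, \ X_{\tau_G} = y) \\
&\le \sup_{y \in G} E_y(\tau_V) \, P_x(\tau_G < \tau_V).
\end{aligned}$$


Taking the supremum over $x \in V$, we obtain the inequality stated in the lemma. □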
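
To see the bound of lemma 16.4 at work, the following Python sketch evaluates its right-hand side on a made-up three-state birth and death chain in which state 2 is rarely visited. The two ingredients are computed by plain fixed-point iteration of the one-step equations, an approach chosen here for brevity rather than taken from the text. The sets V and G are assumed disjoint, and the chain, the number of sweeps and the function names are hypothetical.

```python
def prob_G_before_V(P, V, G, sweeps=2000):
    """P(tau_G < tau_V | X_0 = x) for every x, hitting times counted from step 1."""
    n = len(P)

    def step(w):
        # w[y]: probability of reaching G before V when time is counted from 0
        new = []
        for y in range(n):
            if y in G:
                new.append(1.0)
            elif y in V:
                new.append(0.0)
            else:
                new.append(sum(P[y][z] * w[z] for z in range(n)))
        return new

    w = [1.0 if y in G else 0.0 for y in range(n)]
    for _ in range(sweeps):
        w = step(w)
    return [sum(P[x][z] * w[z] for z in range(n)) for x in range(n)]


def expected_time_to_V(P, V, sweeps=2000):
    """E(tau_V | X_0 = y) for every y, with tau_V = min{n >= 1 : X_n in V}."""
    n = len(P)
    e = [0.0] * n  # e[y]: expected time to reach V when time is counted from 0
    for _ in range(sweeps):
        e = [0.0 if y in V else 1.0 + sum(P[y][z] * e[z] for z in range(n))
             for y in range(n)]
    return [1.0 + sum(P[y][z] * e[z] for z in range(n)) for y in range(n)]


def upper_bound(P, V, G):
    """Right-hand side of the inequality of lemma 16.4."""
    prob = prob_G_before_V(P, V, G)
    time = expected_time_to_V(P, V)
    return max(prob[x] for x in V) * max(time[y] for y in G)


# Hypothetical chain; detailed balance gives mu = (1, 0.2, 0.04) / 1.24.
P_RARE = [[0.9, 0.1, 0.0], [0.5, 0.4, 0.1], [0.0, 0.5, 0.5]]
MU_RARE_STATE_2 = 0.04 / 1.24
```

With V = {0} and G = {2}, the probability factor is 1/60 and the expectation factor is 4.4, so the bound is about 0.073, to be compared with the actual mass of about 0.032: a small probability of reaching G before returning to V, multiplied by a moderate return time, is enough to make the invariant mass of G small.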


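The bound in lemma 16.4 involves the expectation of the hitting times. We present
next simple bounds on these hitting times.


Lemma 16.5 Let $A$ be a subset of $\mathcal{E}$ and let $\tau_A$ be the hitting time of $A$. If there exist
an integer $k$ and a positive number $\beta$ such that


$$\forall x \notin A \qquad P(\tau_A \le k \mid X_0 = x) \ge \beta,$$

then we have

$$\forall x \notin A \quad \forall t \ge 1 \qquad P(\tau_A > t \mid X_0 = x) \le (1 - \beta)^{\lfloor t/k \rfloor},$$

$$\forall x \notin A \qquad E(\tau_A \mid X_0 = x) \le \frac{k}{\beta}.$$

Proof Throughout the proof, we write simply $\tau$ instead of $\tau_A$. Reversing the
inequality, we have

$$\forall x \notin A \qquad P(\tau > k \mid X_0 = x) \le 1 - \beta.$$

Let $t \ge 1$. To obtain the first inequality stated in the lemma, we divide the interval
$\{0, \dots, t\}$ into $n = \lfloor t/k \rfloor$ subintervals of length $k$, and we apply the above inequality.
Let next $x \notin A$ and let $n \ge 1$. We condition on the state visited by the process at time
$(n - 1)k$ and we use the Markov property to get


$$\begin{aligned}
P(\tau > nk \mid X_0 = x) &= \sum_{y \notin A} P(\tau > nk, \ X_{(n-1)k} = y \mid X_0 = x) \\
&= \sum_{y \notin A} P(\tau > nk \mid \tau > (n-1)k, \ X_{(n-1)k} = y) \, P(\tau > (n-1)k, \ X_{(n-1)k} = y \mid X_0 = x) \\
&\le \sum_{y \notin A} P(\tau > k \mid X_0 = y) \, P(\tau > (n-1)k, \ X_{(n-1)k} = y \mid X_0 = x) \\
&\le (1 - \beta) \, P(\tau > (n-1)k \mid X_0 = x).
\end{aligned}$$

Iterating this inequality, we obtain that


$$\forall x \notin A \quad \forall n \ge 1 \qquad P(\tau > nk \mid X_0 = x) \le (1 - \beta)^n.$$


We compute the expectation of $\tau$ as follows: for $x \notin A$,


$$\begin{aligned}
E(\tau \mid X_0 = x) = \sum_{n=0}^{\infty} P(\tau > n \mid X_0 = x) &\le \sum_{i=0}^{\infty} \sum_{j=0}^{k-1} P(\tau > ik + j \mid X_0 = x) \\
&\le \sum_{i=0}^{\infty} k \, P(\tau > ik \mid X_0 = x) \le \sum_{i=0}^{\infty} k (1 - \beta)^i = \frac{k}{\beta},
\end{aligned}$$

as claimed. □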
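
A two-state example shows how the bounds of lemma 16.5 compare with exact values. In the Python sketch below, which rests on a made-up chain, the state outside A jumps into A with probability p at each step, so that the hitting time is geometric: P(τ_A > t) = (1 − p)^t and E(τ_A) = 1/p. Taking β = 1 − (1 − p)^k, which is the exact probability of hitting A within k steps, the function checks both conclusions of the lemma with rational arithmetic. The function name and the parameter values are hypothetical.

```python
from fractions import Fraction as F


def check_geometric_example(p, k, t_max):
    """Check the two bounds of the lemma for a geometric hitting time.

    Returns (beta, tail_ok, mean_ok) where beta = P(tau_A <= k), tail_ok is
    True when P(tau_A > t) <= (1 - beta)^floor(t/k) for 1 <= t <= t_max, and
    mean_ok is True when E(tau_A) <= k / beta.
    """
    beta = 1 - (1 - p) ** k
    tail_ok = all(
        (1 - p) ** t <= (1 - beta) ** (t // k) for t in range(1, t_max + 1)
    )
    mean_ok = 1 / p <= k / beta
    return beta, tail_ok, mean_ok
```

With p = 1/10 and k = 5, the lemma gives E(τ_A) ≤ 5/β ≈ 12.2 while the true mean is 10; the tail bound is an equality whenever t is a multiple of k, so the loss comes only from rounding t down to a multiple of k.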


To employ lemma 16.5, we will have to estimate the probability of hitting a target 
subset in a fixed time interval. It turns out that, in some situations, it is convenient to 
decompose the corresponding trajectories into two parts: a first part which reaches 
a convenient subset and a second part which reaches the target subset. We then use 
the following inequality to combine the estimates. 


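Lemma 16.6 Let $A \subset B$ be two subsets of $\mathcal{E}$ and let $\tau_A, \tau_B$ be the associated hitting
times. For any $x \in \mathcal{E}$ and $a, b \ge 0$, we have
\[
P(\tau_A \le a+b \mid X_0 = x) \,\ge\, P(\tau_B \le b \mid X_0 = x) \times \inf_{y\in B} P(\tau_A \le a \mid X_0 = y).
\]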


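Proof Obviously, we have $\tau_B \le \tau_A$. Let $x \in \mathcal{E}$ and let $a, b \ge 0$. In order to estimate
$\tau_A$, we perform an intermediate conditioning with respect to $\tau_B$ as follows:
\begin{align*}
P(\tau_A \le a+b \mid X_0 = x) &\ge P(\tau_A \le a+b,\ \tau_B \le b \mid X_0 = x) \\
&= \sum_{y\in B} P(\tau_A \le a+b,\ \tau_B \le b,\ X_{\tau_B} = y \mid X_0 = x) \\
&= \sum_{y\in B} P(\tau_A \le a+b \mid \tau_B \le b,\ X_{\tau_B} = y)\, P(\tau_B \le b,\ X_{\tau_B} = y \mid X_0 = x) \\
&\ge \sum_{y\in B} P(\tau_A \le a+\tau_B \mid \tau_B \le b,\ X_{\tau_B} = y)\, P(\tau_B \le b,\ X_{\tau_B} = y \mid X_0 = x).
\end{align*}
Applying the Markov property at time $\tau_B$, we obtain
\begin{align*}
P(\tau_A \le a+b \mid X_0 = x) &\ge \sum_{y\in B} P(\tau_A \le a \mid X_0 = y)\, P(\tau_B \le b,\ X_{\tau_B} = y \mid X_0 = x) \\
&\ge \inf_{y\in B} P(\tau_A \le a \mid X_0 = y)\ P(\tau_B \le b \mid X_0 = x),
\end{align*}
and this is the inequality stated in the lemma. □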


Chapter 17
The Non-Neutral Phase


The aim of this chapter is to study the dynamics of the process $(C_n)_{n\in\mathbb{N}}$ when
the master sequence is present in the population. We denote by $\mathcal{M}$ the set of the
populations which contain at least one master sequence $w^*$, i.e.,
\[
\mathcal{M} \,=\, \{\, c \in \mathcal{S}_m^{\ell+1} : c(0) > 0 \,\}.
\]
Our interest lies in the proportion of master sequences $C_n(0)$, and as we will see, in
the non-neutral phase, asymptotically, the dynamics of $C_n(0)$ depend very weakly
on the other coordinates. We start by developing a large deviations principle for the
transition probabilities of $C_n(0)$. Indeed, if $N_n$ is the number of master sequences
present in the Wright-Fisher process at time $n$, we will show that asymptotically,
\[
P\Big(\frac{1}{m} N_{n+1} = z' \,\Big|\, \frac{1}{m} N_n = z\Big) \,\sim\, e^{-mV(z,z')}, \tag{17.1}
\]


where the function $V$ is non-negative, and takes the value 0 if and only if
$z' = \sigma e^{-a} z/((\sigma-1)z+1)$. Therefore, the proportion of master sequences $(N_n/m)_{n\in\mathbb{N}}$
evolves as a random perturbation of the discrete dynamical system associated to the
mapping
\[
z \,\longmapsto\, \frac{\sigma e^{-a} z}{(\sigma-1)z+1}. \tag{17.2}
\]
Indeed, the probability of not following the trajectories of the dynamical system
becomes exponentially small as $m$ becomes large. In section 17.1, we give a rigorous
counterpart to the statement (17.1) by proving a genuine large deviations principle.
In section 17.2, we study the dynamical system associated to the mapping (17.2).


Asymptotic regime. Throughout this chapter, we fix the parameters $a, \alpha$, and we
consider the asymptotic regime
\[
m, \ell \to +\infty, \qquad q \to 0, \qquad \ell q \to a \in\, ]0,+\infty[\,, \qquad \frac{m}{\ell} \to \alpha \in [0,+\infty[\,.
\]
By asymptotically, we mean that $m, \ell$ have to be sufficiently large, $q$ sufficiently
small, $\ell q$ sufficiently close to $a$, and $m/\ell$ sufficiently close to $\alpha$. For reasons of
space, we do not treat the case $\alpha = +\infty$, which requires different arguments.


17.1 Large Deviations Principle 
Our goal here is to prove a rigorous large deviations principle corresponding to the
statement (17.1). For $p, t \in [0,1]$, we define the quantity $I(p,t)$ as follows:
\[
I(p,t) \,=\, t \ln\frac{t}{p} + (1-t) \ln\frac{1-t}{1-p}.
\]
We make the convention that $a \ln(a/b) = 0$ if $a = b = 0$. The function $I(p,\cdot)$
is the rate function governing the large deviations of a binomial distribution with
parameters $n$ and $p$. We have the following estimate for the binomial coefficients.


Lemma 17.1 For $n \ge 1$ and $0 \le i \le n$, we have
\[
\Big|\, \ln\frac{n!}{i!\,(n-i)!} + i \ln\frac{i}{n} + (n-i) \ln\frac{n-i}{n} \,\Big| \,\le\, 2\ln n + 3.
\]
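Proof Let us set, for $n \in \mathbb{N}$,
\[
\phi(n) \,=\, \ln n! - n\ln n + n.
\]
Comparing the discrete sum
\[
\ln n! \,=\, \sum_{1\le k\le n} \ln k
\]
to the integral $\int_1^n \ln x \, dx$, we see that
\[
\forall n \ge 1 \qquad 1 \,\le\, \phi(n) \,\le\, \ln n + 2. \tag{17.3}
\]
We have
\begin{align*}
\ln\frac{n!}{i!\,(n-i)!} &= n\ln n - n + \phi(n) - \big(i\ln i - i + \phi(i)\big) \\
&\qquad - \big((n-i)\ln(n-i) - (n-i) + \phi(n-i)\big) \\
&= -i\ln\frac{i}{n} - (n-i)\ln\frac{n-i}{n} + \phi(n) - \phi(i) - \phi(n-i).
\end{align*}
Using the inequalities (17.3), we conclude that
\[
-2\ln n - 3 \,\le\, \phi(n) - \phi(i) - \phi(n-i) \,\le\, \ln n,
\]
which gives the desired result. □

We denote by $\mathcal{S}^{\ell+1}$ the simplex of dimension $\ell+1$ defined by
\[
\mathcal{S}^{\ell+1} \,=\, \Big\{\, c = (c(0),\dots,c(\ell)) \in [0,1]^{\ell+1} : \sum_{0\le i\le \ell} c(i) = 1 \,\Big\},
\]
and by $\mathcal{S}_m^{\ell+1}$ the subset of $\mathcal{S}^{\ell+1}$ defined by
\[
\mathcal{S}_m^{\ell+1} \,=\, \mathcal{S}^{\ell+1} \cap \frac{1}{m}\mathbb{N}^{\ell+1}.
\]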


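Conditionally on $C_n = c \in \mathcal{S}_m^{\ell+1}$, the law of the random variable $C_{n+1}(0)$ is the
normalized binomial law $\frac{1}{m}\mathrm{Bin}(m, \Phi_0(c))$, where the map $\Phi_0$ is given by
\[
\forall c \in \mathcal{S}^{\ell+1} \qquad \Phi_0(c) \,=\, \frac{\sigma c(0) M_H(0,0) + \sum_{1\le i\le \ell} c(i) M_H(i,0)}{\sigma c(0) + 1 - c(0)}.
\]
For $c \in \mathcal{S}_m^{\ell+1}$ and $t \in [0,1] \cap \mathbb{N}/m$, we have
\[
P(C_{n+1}(0) = t \mid C_n = c) \,=\, \frac{m!}{(mt)!\,(m(1-t))!}\, \Phi_0(c)^{mt} \big(1-\Phi_0(c)\big)^{m(1-t)},
\]
which we rewrite as
\[
\ln P(C_{n+1}(0) = t \mid C_n = c) \,=\, -m I(\Phi_0(c), t) + \mathcal{E}(c,t), \tag{17.4}
\]
where $\mathcal{E}(c,t)$ is an error term. Thanks to lemma 17.1, this error term satisfies, for all
$m \ge 1$,
\[
\forall c \in \mathcal{S}_m^{\ell+1} \quad \forall t \in [0,1] \cap \mathbb{N}/m \qquad |\mathcal{E}(c,t)| \,\le\, 2\ln m + 3. \tag{17.5}
\]
It follows that, for $m \ge 1$,
\[
\forall c \in \mathcal{S}_m^{\ell+1} \quad \forall t \in [0,1] \cap \mathbb{N}/m \qquad
P(C_{n+1}(0) = t \mid C_n = c) \,\le\, 27\, m^2 \exp\big(-m I(\Phi_0(c), t)\big). \tag{17.6}
\]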
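The decomposition (17.4) and the bound (17.5) can be checked numerically for a binomial law. In the sketch below, the success probability `p` plays the role of $\Phi_0(c)$ and `k` plays the role of $mt$; the function names and the tested values of $m$ and $p$ are illustrative choices, not taken from the text.

```python
import math

def rate_I(p, t):
    """I(p, t) = t ln(t/p) + (1-t) ln((1-t)/(1-p)), with the convention 0 ln 0 = 0."""
    total = 0.0
    for num, den in ((t, p), (1.0 - t, 1.0 - p)):
        if num > 0.0:
            if den <= 0.0:
                return math.inf
            total += num * math.log(num / den)
    return total

def log_binomial_prob(m, k, p):
    """ln P(Bin(m, p) = k) for 0 < p < 1, computed with log-gamma to avoid overflow."""
    log_coeff = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return log_coeff + k * math.log(p) + (m - k) * math.log(1.0 - p)

def error_term(m, k, p):
    """The error term of (17.4): ln P(Bin(m,p) = k) + m I(p, k/m)."""
    return log_binomial_prob(m, k, p) + m * rate_I(p, k / m)

def worst_error(m, p):
    return max(abs(error_term(m, k, p)) for k in range(m + 1))
```

Algebraically the error term does not depend on `p`: it reduces to the quantity bounded in lemma 17.1, which is why (17.5) is uniform in $c$.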


We define a function $V_1 : [0,1] \times [0,1] \to [0,\infty]$ by setting, for $r, t \in [0,1]$,
\[
V_1(r,t) \,=\, I\Big(\frac{\sigma e^{-a} r}{(\sigma-1)r+1},\, t\Big).
\]
In the sequel of the section, by asymptotically, we mean when $\ell, m$ go to infinity, $q$
goes to 0, and $\ell q$ goes to $a$ (we don't need here the condition that $m/\ell \to \alpha$). For
instance, for $r \in \mathcal{S}^{\ell+1}$ and $t \in [0,1]$, asymptotically, we have
\[
I(\Phi_0(r), t) \,\to\, V_1(r(0), t).
\]


We give next the large deviations principle for the transition probabilities of $C_n(0)$.
Before doing so, we introduce some notation in order to alleviate the upcoming
formulas. We denote by $Z_n = C_n(0)$ the proportion of master sequences in the $n$-th
generation of the Wright-Fisher process. For $A$ a subset of $\mathbb{R}$, we denote by $\mathring{A}$ the
interior of $A$, by $\overline{A}$ its closure, and we write $A \cap (\mathbb{N}/m)$ for the set of its points which are multiples of $1/m$. We denote by $\mathbb{I}$ the set
$[0,1] \cap (\mathbb{N}/m)$ and by $\Pi$ the mapping from $[0,1]$ to $\mathbb{I}$ given by
\[
\forall r \in [0,1] \qquad \Pi(r) \,=\, \frac{1}{m} \lfloor rm \rfloor.
\]


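Proposition 17.2 The one step transition probabilities of the process $(Z_n)_{n\ge0}$ satisfy
the large deviations principle governed by $V_1$:
• For any subset $U$ of $[0,1]$ and for any $r \in [0,1]$, we have, for $n \ge 0$,
\[
-\inf\big\{ V_1(r,t) : t \in \mathring{U} \big\} \,\le\, \liminf_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln P\big(Z_{n+1} \in U \mid Z_n = \Pi(r)\big).
\]
• For any subsets $U, U'$ of $[0,1]$, we have, for $n \ge 0$,
\[
\limsup_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln \sup_{z\in U\cap(\mathbb{N}/m)} P\big(Z_{n+1} \in U' \mid Z_n = z\big) \,\le\, -\inf\big\{ V_1(r,t) : r \in \overline{U},\ t \in \overline{U'} \big\}.
\]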


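Proof We begin by showing the large deviations upper bound. Let $U, U'$ be two
subsets of $[0,1]$ and notice that, for all $z \in U \cap (\mathbb{N}/m)$ and $n \ge 0$,
\[
P(Z_{n+1} \in U' \mid Z_n = z) \,\le\, \sup_{c\in\mathcal{S}_m^{\ell+1} :\, c(0)=z} P(Z_{n+1} \in U' \mid C_n = c).
\]
Let $c \in \mathcal{S}_m^{\ell+1}$ be such that $c(0)$ is in $U$. For $n \ge 0$, we have
\[
P(Z_{n+1} \in U' \mid C_n = c) \,=\, \sum_{z'\in U'\cap(\mathbb{N}/m)} P(Z_{n+1} = z' \mid C_n = c).
\]
Using the estimates (17.6) on the transition probabilities for the process $(Z_n)_{n\ge0}$, we
have, for $m$ large enough,
\begin{align*}
\sup_{c\in\mathcal{S}_m^{\ell+1} :\, c(0)\in U} P(Z_{n+1} \in U' \mid C_n = c)
&\le (m+1) \sup_{\substack{c\in\mathcal{S}_m^{\ell+1} :\, c(0)\in U \\ z'\in U'\cap(\mathbb{N}/m)}} P(Z_{n+1} = z' \mid C_n = c) \\
&\le m^4 \exp\Big( -m \min_{\substack{c\in\mathcal{S}_m^{\ell+1} :\, c(0)\in U \\ z'\in U'\cap(\mathbb{N}/m)}} I(\Phi_0(c), z') \Big).
\end{align*}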


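Define the mappings $\underline{\Phi}, \overline{\Phi} : [0,1] \to [0,1]$ by setting, for all $r \in [0,1]$,
\[
\underline{\Phi}(r) \,=\, \frac{r\sigma M_H(0,0)}{(\sigma-1)r+1}, \qquad
\overline{\Phi}(r) \,=\, \frac{r\sigma M_H(0,0) + (1-r) M_H(1,0)}{(\sigma-1)r+1}.
\]
We have, for all $c$ in the unit simplex,
\[
\underline{\Phi}(c(0)) \,\le\, \Phi_0(c) \,\le\, \overline{\Phi}(c(0)).
\]
Define next the function $\underline{I} : [0,1] \times [0,1] \to [0,+\infty]$ by
\[
\forall r, t \in [0,1] \qquad \underline{I}(r,t) \,=\, t \ln\frac{t}{\overline{\Phi}(r)} + (1-t) \ln\frac{1-t}{1-\underline{\Phi}(r)}.
\]
The function $\underline{I}$ satisfies
\[
\forall c \in \mathcal{S}^{\ell+1} \quad \forall t \in [0,1] \qquad \underline{I}(c(0), t) \,\le\, I(\Phi_0(c), t).
\]
Thus,
\[
\sup_{c\in\mathcal{S}_m^{\ell+1} :\, c(0)\in U} P(Z_{n+1} \in U' \mid C_n = c) \,\le\, m^4 \exp\Big( -m \min\big\{ \underline{I}(z,z') : z \in U,\ z' \in U' \big\} \Big),
\]
where $z$ and $z'$ range over the multiples of $1/m$ contained in $U$ and $U'$.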


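For each $m \ge 1$, let $z_m \in U$, $z'_m \in U'$ be two numbers that realize the above minimum.
We observe next the expression
\[
\limsup_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} -\underline{I}(z_m, z'_m).
\]
Up to the extraction of a subsequence, we can suppose that, when $m \to \infty$,
\[
z_m \to r \in \overline{U}, \qquad z'_m \to t \in \overline{U'}.
\]
Moreover, asymptotically, for $r, t \in [0,1]$,
\[
\underline{I}(r,t) \,\to\, V_1(r,t).
\]
It follows that
\[
\limsup_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} -\underline{I}(z_m, z'_m) \,\le\, -V_1(r,t).
\]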


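Taking the infimum of $V_1(r,t)$ over $\overline{U} \times \overline{U'}$, we obtain the upper bound of the large
deviations principle. We show next the lower bound. Let $r, t \in [0,1]$. For $n \ge 0$, we
have
\[
P\big(Z_{n+1} = \Pi(t) \mid Z_n = \Pi(r)\big) \,\ge\, \inf_{c\in\mathcal{S}_m^{\ell+1} :\, c(0)=\Pi(r)} P\big(Z_{n+1} = \Pi(t) \mid C_n = c\big).
\]
Let $c \in \mathcal{S}_m^{\ell+1}$ be such that $c(0) = \Pi(r)$. By (17.4) and (17.5), we have,
for $m$ large enough,
\[
P\big(Z_{n+1} = \Pi(t) \mid C_n = c\big) \,\ge\, m^{-3} \exp\big( -m I(\Phi_0(c), \Pi(t)) \big).
\]
Define the function $\overline{I} : [0,1] \times [0,1] \to [0,+\infty]$ by
\[
\forall r, t \in [0,1] \qquad \overline{I}(r,t) \,=\, t \ln\frac{t}{\underline{\Phi}(r)} + (1-t) \ln\frac{1-t}{1-\overline{\Phi}(r)}.
\]
The function $\overline{I}$ satisfies
\[
\forall c \in \mathcal{S}^{\ell+1} \quad \forall t \in [0,1] \qquad \overline{I}(c(0), t) \,\ge\, I(\Phi_0(c), t).
\]
Thus,
\[
\inf_{c\in\mathcal{S}_m^{\ell+1} :\, c(0)=\Pi(r)} P\big(Z_{n+1} = \Pi(t) \mid C_n = c\big) \,\ge\, m^{-3} \exp\big( -m \overline{I}(\Pi(r), \Pi(t)) \big).
\]
Moreover, asymptotically,
\[
\overline{I}(\Pi(r), \Pi(t)) \,\to\, V_1(r,t).
\]
We take the logarithm and we send $m, \ell$ to $\infty$ and $q$ to 0. We obtain
\[
\liminf_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln P\big(Z_{n+1} = \Pi(t) \mid Z_n = \Pi(r)\big) \,\ge\, -V_1(r,t).
\]
Moreover, if $t \in \mathring{U}$, for $m$ large enough, $\Pi(t)$ belongs to $U$. Therefore,
\[
\liminf_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln P\big(Z_{n+1} \in U \mid Z_n = \Pi(r)\big) \,\ge\, -\inf\big\{ V_1(r,t) : t \in \mathring{U} \big\}.
\]
This is the desired large deviations lower bound. □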


A similar proof shows that the $l$-step transition probabilities of $(Z_n)_{n\ge0}$ also satisfy
a large deviations principle. For $l \ge 2$, we define a function $V_l$ on $[0,1] \times [0,1]$ as
follows:
\[
V_l(r,t) \,=\, \inf\Big\{ \sum_{k=0}^{l-1} V_1(\rho_k, \rho_{k+1}) : \rho_0 = r,\ \rho_l = t,\ \rho_k \in [0,1] \text{ for } 0 \le k \le l \Big\}.
\]
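To make the definition of the cost functions concrete, here is a small sketch that evaluates $V_1(r,t) = I(\sigma e^{-a} r/((\sigma-1)r+1), t)$ and approximates $V_l$ from above by restricting the intermediate points $\rho_1,\dots,\rho_{l-1}$ to a finite grid (a min-plus recursion). The values of `SIGMA` and `A_PARAM` and the grid are assumptions made for the illustration; the grid restriction only yields an upper bound on the true infimum.

```python
import math

SIGMA, A_PARAM = 3.0, 0.5   # illustrative values of sigma and a

def rate_I(p, t):
    """I(p, t) = t ln(t/p) + (1-t) ln((1-t)/(1-p)), with 0 ln 0 = 0 and +inf when undefined."""
    total = 0.0
    for num, den in ((t, p), (1.0 - t, 1.0 - p)):
        if num > 0.0:
            if den <= 0.0:
                return math.inf
            total += num * math.log(num / den)
    return total

def psi(r):
    """The mapping r -> sigma e^{-a} r / ((sigma - 1) r + 1)."""
    return SIGMA * math.exp(-A_PARAM) * r / ((SIGMA - 1.0) * r + 1.0)

def V1(r, t):
    return rate_I(psi(r), t)

def Vl_on_grid(r, t, l, grid):
    """Upper approximation of V_l(r, t): intermediate points are restricted to `grid`."""
    if l == 1:
        return V1(r, t)
    cost = {rho: V1(r, rho) for rho in grid}          # best cost of reaching rho in one step
    for _ in range(l - 2):                             # extend the path one step at a time
        cost = {rho: min(cost[s] + V1(s, rho) for s in grid) for rho in grid}
    return min(cost[s] + V1(s, t) for s in grid)       # final step to the endpoint t
```

For instance, adding the point `psi(r)` to the grid makes the two-step cost from `r` to `psi(psi(r))` vanish, since the path then follows the deterministic trajectory exactly.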


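Corollary 17.3 For $l \ge 1$, the $l$-step transition probabilities of $(Z_n)_{n\ge0}$ satisfy the
large deviations principle governed by $V_l$:
• For any subset $U$ of $[0,1]$ and for any $r \in [0,1]$, we have, for $n \ge 0$,
\[
-\inf\big\{ V_l(r,t) : t \in \mathring{U} \big\} \,\le\, \liminf_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln P\big(Z_{n+l} \in U \mid Z_n = \Pi(r)\big).
\]
• For any subsets $U, U'$ of $[0,1]$, we have, for $n \ge 0$,
\[
\limsup_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln \sup_{z\in U\cap(\mathbb{N}/m)} P\big(Z_{n+l} \in U' \mid Z_n = z\big) \,\le\, -\inf\big\{ V_l(r,t) : r \in \overline{U},\ t \in \overline{U'} \big\}.
\]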


We show next that the variational problems associated to the cost function $V_l$ are
well posed.

Proposition 17.4 For every pair of closed sets $F, G \subset [0,1]$ and for every $l \ge 1$, the
infimum
\[
\inf\big\{ V_l(x,y) : x \in F,\ y \in G \big\}
\]
is attained.


Proof Let $F, G \subset [0,1]$ be two closed sets and let $l \ge 1$. Denote by $c$ the infimum
in question:
\[
c \,=\, \inf\Big\{ \sum_{k=0}^{l-1} V_1(\rho_k, \rho_{k+1}) : \rho_0 \in F,\ \rho_l \in G,\ \rho_k \in [0,1] \text{ for } 0 \le k \le l \Big\},
\]
and suppose that $c < \infty$ (otherwise the result is immediate). By definition, there
exists a sequence of finite sequences $(\rho^n = (\rho^n_h)_{0\le h\le l},\ n \in \mathbb{N})$ such that the quantity
\[
\sum_{h=0}^{l-1} V_1(\rho^n_h, \rho^n_{h+1})
\]
converges to $c$ when $n$ goes to infinity. Up to the extraction of $l+1$ subsequences,
the sequence $(\rho^n)_{n\in\mathbb{N}}$ has a limit, $\rho^*$. This limit satisfies
\[
\sum_{h=0}^{l-1} V_1(\rho^*_h, \rho^*_{h+1}) \,=\, c,
\]
and since $F$, $G$ and $[0,1]$ are closed sets, necessarily $\rho^*_0 \in F$, $\rho^*_l \in G$ and $\rho^*_h \in [0,1]$
for $0 \le h \le l$. □


17.2 Perturbed Dynamical System 


Let us define the mapping $\Psi$ from $[0,1]$ to $[0,1]$ by
\[
\forall r \in [0,1] \qquad \Psi(r) \,=\, \frac{\sigma e^{-a} r}{(\sigma-1)r+1}.
\]
We have the following equivalence:
\[
V_1(r,t) = 0 \quad\Longleftrightarrow\quad t = \Psi(r).
\]
Likewise, the rate function $V_l(r,t)$ is equal to zero if and only if $t = \Psi^l(r)$. Therefore,
the Markov chain $(Z_n)_{n\ge0}$ can be seen as a random perturbation of the discrete-time
dynamical system associated to the mapping $\Psi$. This dynamical system has
already been studied in section 10.1. Indeed, for every $c \in \mathcal{S}^{\ell+1}$, we have $\Psi(c(0)) =
\Phi_\infty(c)(0)$. We recall that the mapping $\Psi$ has at most two fixed points, 0 and
\[
\frac{\sigma e^{-a} - 1}{\sigma - 1}
\]
(which is in the unit interval if $\sigma e^{-a} > 1$), and that for every $r \in\, ]0,1]$,
\[
\lim_{n\to\infty} \Psi^n(r) \,=\,
\begin{cases}
0 & \text{if } \sigma e^{-a} \le 1, \\
Q(\sigma,a)(0) & \text{if } \sigma e^{-a} > 1.
\end{cases}
\]

Moreover, for $r \in [0,1]$, the derivative of $\Psi$ is given by
\[
\Psi'(r) \,=\, \frac{\sigma e^{-a}}{((\sigma-1)r+1)^2},
\]
so that if $\sigma e^{-a} > 1$, we have $\Psi'(Q(\sigma,a)(0)) = 1/(\sigma e^{-a}) < 1$. Thus, the mapping $\Psi$
is contracting in a neighborhood of $Q(\sigma,a)(0)$. In the next lemma, we control how
the iterates of $\Psi$ send compact subsets of $]0,1]$ to the neighborhood of $Q(\sigma,a)(0)$.
To alleviate the notation, we simply write $Q(0)$ instead of $Q(\sigma,a)(0)$.


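Lemma 17.5 Suppose that $\sigma e^{-a} > 1$ and let $\delta > 0$. There exists an $h = h(\delta) \in \mathbb{N}$
such that, for every $r \in [\delta,1]$, we have
\[
\Psi^h(r) \in\, ]Q(0)-\delta,\ Q(0)+\delta[\,.
\]

Proof The mapping $\Psi$ being increasing, for every $r \in [\delta, Q(0)]$, we have $\Psi^n(r) \ge
\Psi^n(\delta)$. Likewise, for any $r \in [Q(0),1]$, we have $\Psi^n(r) \le \Psi^n(1)$. In view of theorem 10.1,
there exist $h_\delta, h_1 \in \mathbb{N}$ such that
\[
\forall h \ge h_\delta \qquad |\Psi^h(\delta) - Q(0)| < \delta,
\]
\[
\forall h \ge h_1 \qquad |\Psi^h(1) - Q(0)| < \delta.
\]
It suffices to choose $h = \max(h_\delta, h_1)$. □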
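The statement of lemma 17.5 is easy to observe numerically. The following sketch iterates $\Psi$ from the two extreme starting points $\delta$ and 1 and returns the first number of iterations after which both are within $\delta$ of the non-zero fixed point $(\sigma e^{-a}-1)/(\sigma-1)$; by monotonicity of $\Psi$, every starting point in $[\delta,1]$ is then trapped as well. The numerical values of $\sigma$, $a$ and $\delta$ are illustrative assumptions.

```python
import math

def make_psi(sigma, a):
    """Return the mapping Psi(r) = sigma e^{-a} r / ((sigma - 1) r + 1)."""
    def psi(r):
        return sigma * math.exp(-a) * r / ((sigma - 1.0) * r + 1.0)
    return psi

def nonzero_fixed_point(sigma, a):
    """Non-zero solution of Psi(r) = r; it lies in ]0,1[ when sigma e^{-a} > 1."""
    return (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)

def iterations_needed(sigma, a, delta, max_iter=100_000):
    """Smallest h such that Psi^h(delta) and Psi^h(1) are both within delta of the fixed point."""
    psi, q0 = make_psi(sigma, a), nonzero_fixed_point(sigma, a)
    low, high, h = delta, 1.0, 0
    while not (abs(low - q0) < delta and abs(high - q0) < delta):
        low, high, h = psi(low), psi(high), h + 1
        if h > max_iter:
            raise RuntimeError("no convergence: is sigma * exp(-a) > 1 ?")
    return h

def iterate(psi, r, n):
    for _ in range(n):
        r = psi(r)
    return r
```

Because the orbits started at $\delta$ and at 1 are monotone, the first entrance time into the $\delta$-neighborhood is also valid for all later iterations.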


17.3 Time away from the Fixed Points 


This section is devoted to showing that the process $(Z_n)_{n\ge0}$ has a very small probability
of staying away from the fixed points of the mapping $\Psi$ for a long time. We
suppose throughout this section that $\sigma e^{-a} > 1$. The meaning of “asymptotically” is
recalled at the beginning of the chapter.


Lemma 17.6 Let $\delta > 0$. There exist constants $h = h(\delta) \in \mathbb{N}$ and $c = c(\delta) > 0$ such
that, asymptotically,
\[
\forall r \in [\delta,1] \qquad P\big(|Z_h - Q(0)| < \delta \mid Z_0 = \Pi(r)\big) \,\ge\, 1 - \exp(-cm).
\]


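Proof Let $\delta > 0$. Let us take $h = h(\delta/2)$ from lemma 17.5. By the large deviations
principle of corollary 17.3,
\begin{multline*}
\limsup_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} \frac{1}{m} \ln \sup_{r\in[\delta,1]} P\big(|Z_h - Q(0)| \ge \delta \mid Z_0 = \Pi(r)\big) \\
\le\, -\inf\big\{ V_h(r,t) : r \in [\delta,1],\ t \notin\, ]Q(0)-\delta,\ Q(0)+\delta[ \,\big\}.
\end{multline*}
For every $r \in [\delta,1]$, we have $|\Psi^h(r) - Q(0)| < \delta/2$. By proposition 17.4, the above
infimum must be strictly positive. □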


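For $\delta > 0$, we define the set $U_\delta$ by
\[
U_\delta \,=\, [0,\delta[\ \cup\ ]Q(0)-\delta,\ Q(0)+\delta[\,.
\]

Corollary 17.7 Let $\delta > 0$. Let $\tau_\delta$ be the hitting time of $U_\delta$, i.e.,
\[
\tau_\delta \,=\, \min\{\, n \ge 1 : Z_n \in U_\delta \,\}.
\]
There exist $h(\delta) \in \mathbb{N}$ and $c(\delta) > 0$ such that, asymptotically,
\[
\forall n \ge 1 \quad \forall r \in [\delta,1] \qquad P(\tau_\delta > n \mid Z_0 = \Pi(r)) \,\le\, \exp\Big( -mc \Big\lfloor \frac{n}{h} \Big\rfloor \Big).
\]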


Proof Let $\delta > 0$. We first remark that, for any $h \ge 1$ and for any $r \in [0,1] \setminus U_\delta$,
\[
P(\tau_\delta > h \mid Z_0 = \Pi(r)) \,\le\, P\big(|Z_h - Q(0)| \ge \delta \mid Z_0 = \Pi(r)\big).
\]
We take $h = h(\delta)$ from lemma 17.6 and we obtain that, asymptotically,
\[
\forall r \in [0,1] \setminus U_\delta \qquad P(\tau_\delta > h \mid Z_0 = \Pi(r)) \,\le\, \exp(-cm).
\]
Now that we have this uniform bound, the result of the corollary follows from
lemma 16.5. □


17.4 Reaching the Quasispecies 


We suppose again throughout this section that $\sigma e^{-a} > 1$. The aim of this section
is to compute a lower bound on the probability of a scenario leading to populations containing more
than $(Q(0)-\varepsilon)m$ master sequences in a reasonable amount of time, when the initial
population contains very few master sequences. The meaning of “asymptotically” is
recalled at the beginning of the chapter.

Theorem 17.8 Let $\varepsilon > 0$. There exists a positive constant $C$ such that, asymptotically,
for every $h \ge 1$,
\[
P\big(Z_{\lfloor C\ln m\rfloor} > Q(0) - \varepsilon \mid Z_0 = h/m\big) \,\ge\, \frac{1}{(m+1)^{C\ln m}}.
\]

The delicate point is to show that, starting from any positive number of master
sequences, it takes a time of order $\ln m$ for the process to create $\delta m$ master sequences,
for some small $\delta > 0$, and the probability for this to happen is at least of order $m^{-C\ln m}$.
Once this point is settled, it remains to bound from below the probability of creating
a finite number of master sequences starting from one master sequence, and to go
from a positive proportion of master sequences to the neighborhood of $Q(0)$. The first
point is achieved via a direct computation. For the second point, we know that the
trajectories of the dynamical system which start from $[\delta,1]$ reach the neighborhood
of $Q(0)$ in a finite number of iterations. Yet the large deviations principle ensures
that, with high probability, the trajectories of the process are close to those of the
dynamical system.

Let us explain the idea for the proof of the delicate point, namely to bound from
below the probability of creating $\delta m$ master sequences starting from one master
sequence. The map $\underline{\Phi}(c(0))$ is expanding in the neighborhood of $c(0) = 0$. So if we
start with one master sequence, the number of master sequences should typically
grow geometrically. However, when the number of master sequences is small, the
law of large numbers is not applicable and the trajectory might fluctuate a lot around
its mean. To avoid this problem, we first build a map $F$ such that, for a fixed
$h \ge 1$, a small perturbation $(z_n)_{n\ge0}$ (say of order $1/m$) of the deterministic trajectory
$(F^n(h/m))_{n\ge0}$ is still increasing and exits the $\delta$-neighborhood of 0 in less than $C\ln m$
steps, for some $C > 0$ independent of $h$. Next, we show that the probability for the
process $(Z_n)_{n\ge0}$ to stay above the trajectory $(z_n)_{n\ge0}$ is greater than $(m+1)^{-C\ln m}$.

Before we proceed with this plan, we give a couple of useful lemmas concerning 
the binomial distribution. 


Lemma 17.9 Let $n \ge 2$ and $0 < p < 1$. The binomial distribution $\mathrm{Bin}(n,p)$ is
maximal at $\lfloor (n+1)p \rfloor$, and thus, if $X \sim \mathrm{Bin}(n,p)$, we have
\[
P\big(X = \lfloor (n+1)p \rfloor\big) \,\ge\, \frac{1}{n+1}.
\]


Proof Let $k \in \{0,\dots,n-1\}$. We have
\[
\frac{P(X = k+1)}{P(X = k)} \,=\, \frac{(n-k)p}{(k+1)(1-p)}.
\]
This last quantity is strictly larger than 1 if and only if $k < np - (1-p)$, thus, the
quantity $P(X = k)$ is maximal for $k = \lfloor (n+1)p \rfloor$. □

Lemma 17.10 Let $n \ge 2$ and let $0 < p < q < 1$. Let $X, Y$ be two random variables
having distributions $\mathrm{Bin}(n,p)$ and $\mathrm{Bin}(n,q)$. Then
\[
\forall k \in \{0,\dots,n\} \qquad P(X \ge k) \,\le\, P(Y \ge k).
\]


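Proof Let $U_1,\dots,U_n$ be independent random variables, with a uniform distribution
on the interval $[0,1]$. Then, for $0 \le k \le n$,
\[
P(X \ge k) \,=\, P\Big( \sum_{i=1}^{n} 1_{\{U_i < p\}} \ge k \Big) \,\le\, P\Big( \sum_{i=1}^{n} 1_{\{U_i < q\}} \ge k \Big) \,=\, P(Y \ge k),
\]
as wanted. □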
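Both binomial facts (lemmas 17.9 and 17.10) can be verified exactly for moderate values of $n$. The sketch below computes the probability mass function with `math.comb`, checks that $\lfloor (n+1)p \rfloor$ is a mode carrying mass at least $1/(n+1)$, and computes the upper tails $P(X \ge k)$ used in the stochastic comparison. The helper names are ours and the tolerance absorbs floating-point ties.

```python
import math

def binom_pmf(n, p):
    """Probability mass function of Bin(n, p) as a list indexed by k = 0..n."""
    return [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

def mode_is_floor(n, p, tol=1e-15):
    """Check lemma 17.9: floor((n+1)p) maximizes the pmf and has mass at least 1/(n+1)."""
    pmf = binom_pmf(n, p)
    k_star = math.floor((n + 1) * p)
    return pmf[k_star] >= max(pmf) - tol and pmf[k_star] >= 1.0 / (n + 1)

def upper_tails(n, p):
    """Return the list [P(X >= k) for k = 0..n] with X ~ Bin(n, p)."""
    pmf = binom_pmf(n, p)
    tails, acc = [], 0.0
    for k in range(n, -1, -1):
        acc += pmf[k]
        tails.append(acc)
    return tails[::-1]

def dominated(n, p, q, tol=1e-12):
    """Check lemma 17.10: P(X >= k) <= P(Y >= k) for all k when p < q."""
    return all(tp <= tq + tol for tp, tq in zip(upper_tails(n, p), upper_tails(n, q)))
```

The lower bound $1/(n+1)$ is simply the pigeonhole principle: the $n+1$ masses sum to one, so the largest is at least $1/(n+1)$.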


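We proceed now to implement the scheme presented above for the proof of theorem 17.8.
We define the mapping $F : [0,1] \to [0,1]$ by setting
\[
\forall r \in [0,1] \qquad F(r) \,=\, \frac{\sigma M_H(0,0)\, r}{(\sigma-1)r+1}.
\]
Note that, for any $c \in \mathcal{S}^{\ell+1}$, we have $\Phi(c)(0) \ge F(c(0))$. Recall that we assume,
throughout the whole section, that $\sigma e^{-a} > 1$. Suppose that $\delta > 0$ is small enough so
that
\[
\alpha_\delta \,=\, \frac{\sigma(e^{-a} - \delta)}{(\sigma-1)\delta+1} \,>\, 1,
\]
and that $m$ is large enough so that
\[
\frac{\alpha_\delta}{\alpha_\delta - 1} \,<\, m\delta.
\]
Let $h$ be an element of $\mathbb{I}$ greater than $\alpha_\delta/(m(\alpha_\delta-1))$ and define the sequence $(z_n)_{n\ge0}$
by setting $z_0 = h$ and
\[
z_n \,=\, F(z_{n-1}) - \frac{1}{m}, \qquad n \ge 1.
\]
Define $N_\delta$ to be the first time when the sequence $(z_n)_{n\ge0}$ is above $\delta$, i.e.,
\[
N_\delta \,=\, \inf\{\, n \ge 0 : z_n \ge \delta \,\}.
\]
We provide next an upper bound on $N_\delta$.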


Lemma 17.11 The sequence $(z_n)_{0\le n\le N_\delta}$ is increasing, and there exists a positive
constant $C$ (which depends on $\delta$ but not on $m$) such that $N_\delta \le C\ln m$ for $m$ large
enough.


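Proof Let $z$ be such that
\[
\frac{\alpha_\delta}{m(\alpha_\delta-1)} \,\le\, z \,<\, \delta.
\]
For $\ell$ large enough, $q$ small enough, and $\ell q$ close enough to $a$, we have $M_H(0,0) >
e^{-a} - \delta$, whence
\[
F(z) - \frac{1}{m} - z \,=\, \Big( \frac{\sigma M_H(0,0)}{(\sigma-1)z+1} - 1 \Big) z - \frac{1}{m} \,\ge\, (\alpha_\delta - 1)z - \frac{1}{m} \,>\, 0.
\]
Thus, the sequence $(z_n)_{0\le n\le N_\delta}$ is increasing. Moreover, for $0 \le n < N_\delta$,
\begin{align*}
z_n \,&\ge\, \alpha_\delta z_{n-1} - \frac{1}{m} \,\ge\, \cdots \,\ge\, (\alpha_\delta)^n z_0 - \frac{1}{m} \sum_{k=0}^{n-1} (\alpha_\delta)^k \\
&\ge\, \frac{(\alpha_\delta)^{n+1}}{m(\alpha_\delta-1)} - \frac{(\alpha_\delta)^n - 1}{m(\alpha_\delta-1)} \,\ge\, \frac{(\alpha_\delta)^n}{m}.
\end{align*}
Yet, for $n$ such that
\[
n \,\ge\, \frac{\ln(\delta m)}{\ln \alpha_\delta},
\]
we have $(\alpha_\delta)^n \ge \delta m$, which proves the second statement of the lemma. □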
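The construction of the sequence $(z_n)$ and the logarithmic bound on $N_\delta$ can be illustrated as follows. The sketch uses the explicit value $M_H(0,0) = (1-q)^\ell$, which follows from the definition of the lumped mutation matrix recalled in section 18.2 (no mutation on any of the $\ell$ sites); all numerical parameters are illustrative assumptions, and the bound checked is $N_\delta \le \lceil \ln(\delta m)/\ln\alpha_\delta \rceil$, as in the proof.

```python
import math

def alpha_delta(sigma, a, delta):
    """The expansion factor alpha_delta = sigma (e^{-a} - delta) / ((sigma - 1) delta + 1)."""
    return sigma * (math.exp(-a) - delta) / ((sigma - 1.0) * delta + 1.0)

def perturbed_trajectory(sigma, ell, q, m, delta, z0, max_steps=10_000):
    """Iterate z_n = F(z_{n-1}) - 1/m from z_0 until z_n >= delta; return [z_0, ..., z_N]."""
    mh00 = (1.0 - q) ** ell                      # M_H(0,0) = P(Bin(ell, q) = 0)
    def F(r):
        return sigma * mh00 * r / ((sigma - 1.0) * r + 1.0)
    traj = [z0]
    while traj[-1] < delta and len(traj) <= max_steps:
        traj.append(F(traj[-1]) - 1.0 / m)
    return traj

# Illustrative parameters (assumptions, not from the text)
sigma, a, ell, m, delta = 3.0, 0.5, 1000, 10_000, 0.05
q = a / ell
alpha = alpha_delta(sigma, a, delta)
# Smallest multiple of 1/m strictly above alpha / (m (alpha - 1))
z0 = math.floor(alpha / (alpha - 1.0) + 1.0) / m
traj = perturbed_trajectory(sigma, ell, q, m, delta, z0)
n_delta = len(traj) - 1
log_bound = math.ceil(math.log(delta * m) / math.log(alpha))
```

The subtraction of $1/m$ at each step is what later leaves room for the integer part in lemma 17.12, at the price of requiring the starting point to exceed $\alpha_\delta/(m(\alpha_\delta-1))$.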


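The next step is to bound from below the probability for the process $(Z_n)_{0\le n\le N_\delta}$ to
stay above the trajectory $(z_n)_{0\le n\le N_\delta}$.

Lemma 17.12 For $\alpha_\delta/(m(\alpha_\delta-1)) \le z_0 < \delta$ and for $n \le N_\delta$, we have
\[
P(Z_1 \ge z_1, \dots, Z_n \ge z_n \mid Z_0 = z_0) \,\ge\, \frac{1}{(m+1)^n}.
\]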


Proof As shown in lemma 17.9, the binomial law with parameters $m$ and $p$ is
maximal at $\lfloor (m+1)p \rfloor$, and therefore, if $X \sim \mathrm{Bin}(m,p)$, the probability of the event
$\{X \ge \lfloor mp \rfloor\}$ is bounded from below by $(m+1)^{-1}$. The key to the proof of the
current lemma is to show that, for any $z \ge z_n$, conditionally on $Z_n = z$, the central
term of the binomial law associated to the random variable $Z_{n+1}$ is larger than $m z_{n+1}$.
Indeed, for $0 \le n < N_\delta$, we have
\begin{multline*}
P(Z_1 \ge z_1, \dots, Z_n \ge z_n,\ Z_{n+1} \ge z_{n+1} \mid Z_0 = z_0) \\
=\, \sum_{z\in\mathbb{I} :\, z\ge z_n} P(Z_1 \ge z_1, \dots, Z_{n-1} \ge z_{n-1},\ Z_n = z \mid Z_0 = z_0)\, P(Z_{n+1} \ge z_{n+1} \mid Z_n = z).
\end{multline*}
Fix $z \in \mathbb{I}$ to be greater than or equal to $z_n$ and let $c \in \mathcal{S}_m^{\ell+1}$ be such that $c(0) = z$.
Conditionally on the event $\{Z_n = z\}$, the distribution of the random variable $Z_{n+1}$
is the normalized binomial law $\frac{1}{m}\mathrm{Bin}(m, \Phi_0(c))$. By the above remarks,
\[
\lfloor m\Phi_0(c) \rfloor \ge m z_{n+1} \quad\Longrightarrow\quad P(Z_{n+1} \ge z_{n+1} \mid Z_n = z) \,\ge\, \frac{1}{m+1}.
\]
Yet, since $\Phi(c)(0) \ge F(c(0))$ for every $c \in \mathcal{S}^{\ell+1}$,
\[
\lfloor m\Phi(c)(0) \rfloor \,\ge\, \lfloor mF(z) \rfloor \,\ge\, mF(z) - 1,
\]
and since $F$ is increasing, this last quantity is greater than or equal to
\[
mF(z_n) - 1 \,=\, m z_{n+1},
\]
as wanted. Thus,
\[
P(Z_1 \ge z_1, \dots, Z_{n+1} \ge z_{n+1} \mid Z_0 = z_0) \,\ge\, \frac{1}{m+1}\, P(Z_1 \ge z_1, \dots, Z_n \ge z_n \mid Z_0 = z_0).
\]
Iterating this inequality, we get the result of the lemma. □
We combine the two previous lemmas in order to obtain the following corollary. 


Corollary 17.13 There exists a positive constant $C$ (which depends on $\delta$ but not on
$m$) such that, for every positive $h \in \mathbb{I}$, for $m$ large enough,
\[
P\big(Z_{\lfloor C\ln m\rfloor} \ge \delta \mid Z_0 = h\big) \,\ge\, (m+1)^{-C\ln m}.
\]


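Proof The result has been proved for $h$ greater than $\alpha_\delta/(m(\alpha_\delta-1))$ in the course of
the proof of lemma 17.11, so it remains to show that it is still true if the initial point
$h$ is positive and smaller than $\alpha_\delta/(m(\alpha_\delta-1))$. Note that, for any $c \in \mathcal{S}_m^{\ell+1}$ such that
$c(0) \ge 1/m$, for $m$ large enough,
\[
\Phi_0(c) \,\ge\, \frac{\sigma M_H(0,0)\, c(0)}{(\sigma-1)c(0)+1} \,\ge\, \frac{M_H(0,0)}{2m}.
\]
Take $X$ to be a random variable with distribution $\mathrm{Bin}(m, M_H(0,0)/(2m))$. Applying
lemma 17.10, for any $c \in \mathcal{S}_m^{\ell+1}$ such that $1/m \le c(0) \le \alpha_\delta/(m(\alpha_\delta-1))$, we have
\[
P\Big(Z_1 \ge \frac{\alpha_\delta}{m(\alpha_\delta-1)} \,\Big|\, C_0 = c\Big) \,\ge\, P\Big(X \ge \frac{\alpha_\delta}{\alpha_\delta-1}\Big).
\]
Set $i = \lceil \alpha_\delta/(\alpha_\delta-1) \rceil$. Then, the last probability is bounded from below by
\begin{align*}
\frac{m!}{i!\,(m-i)!} \Big( \frac{M_H(0,0)}{2m} \Big)^{i} \Big( 1 - \frac{M_H(0,0)}{2m} \Big)^{m-i}
\,\ge\, \frac{1}{i!} \Big( \frac{(m-i) M_H(0,0)}{2m} \Big)^{i} \Big( 1 - \frac{M_H(0,0)}{2m} \Big)^{m-i}.
\end{align*}
We conclude that
\[
\liminf_{\substack{\ell, m\to\infty,\ q\to0 \\ \ell q\to a}} P\Big(Z_1 \ge \frac{\alpha_\delta}{m(\alpha_\delta-1)} \,\Big|\, C_0 = c\Big) \,>\, 0.
\]
A simple conditioning completes the proof. □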


Theorem 17.8 follows from lemma 17.6 and corollary 17.13.


17.5 Escape from the Quasispecies


The aim of this section is to estimate the probability that the proportion of master
sequences becomes smaller than $\delta$, for $\delta > 0$. More precisely, let us define the
stopping times $\tau_\delta$ and $\tau_{Q(0)-\delta}$ by
\[
\tau_\delta \,=\, \inf\{\, n \ge 1 : Z_n < \delta \,\}, \qquad \tau_{Q(0)-\delta} \,=\, \inf\{\, n \ge 1 : Z_n > Q(0)-\delta \,\}.
\]
We define the function $\psi$ by setting, for $a \in\, ]0,+\infty[$,
\[
\psi(a) \,=\, \inf_{l\ge1} V_l(Q(0), 0).
\]
We have the following large deviations estimate.

Lemma 17.14 Let $\varepsilon > 0$. For $\delta$ small enough, and for $z \in [Q(0)-\delta, 1] \cap \mathbb{I}$,
\[
P(\tau_\delta < \tau_{Q(0)-\delta} \mid Z_0 = z) \,\le\, \exp\big( -m(\psi(a) - \varepsilon) \big).
\]


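Proof If $\sigma e^{-a} \le 1$, then $\psi(a) = 0$, and the result is obvious. Let us suppose that
$\sigma e^{-a} > 1$. Let $\delta > 0$ and let $z \in [Q(0)-\delta, 1] \cap \mathbb{I}$. We have, for any integer $k \ge 2$,
\begin{align*}
P(\tau_\delta < \tau_{Q(0)-\delta} \mid Z_0 = z)
&= \sum_{n\ge1} P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big) \\
&= \sum_{1\le n<k} P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big) \\
&\qquad + \sum_{n\ge k} P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big).
\end{align*}
Let $h$ and $c$ be associated to $\delta$ as in corollary 17.7. Then, for any $n \ge 1$,
\[
P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big) \,\le\, \exp\Big( -mc \Big\lfloor \frac{n}{h} \Big\rfloor \Big).
\]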


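On one hand, summing over $n \ge k$ in the above inequality, we obtain, for $m$ large
enough,
\begin{align*}
\sum_{n\ge k} P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big)
\,&\le\, \sum_{n\ge k} \exp\Big( -mc \Big( \frac{n}{h} - 1 \Big) \Big) \\
&\le\, 2 \exp\Big( -\frac{mc}{h}(k-h) \Big).
\end{align*}
We choose $k$ large enough so that the above quantity is smaller than $e^{-m(\psi(a)-\varepsilon)}/2$.
On the other hand, we have, thanks to the large deviations principle of corollary 17.3:
for any $\varepsilon > 0$, for $m$ large enough,
\begin{align*}
\sum_{1\le n<k} P\big(Z_1, \dots, Z_n \in\, ]\delta, Q(0)-\delta[,\ Z_{n+1} < \delta \mid Z_0 = z\big)
\,&\le\, \sum_{1\le n<k} P(Z_{n+1} < \delta \mid Z_0 = z) \\
&\le\, \sum_{1\le n<k} \exp\Big( -m \inf\big\{ V_{n+1}(r,t) : r \ge Q(0)-\delta,\ t \le \delta \big\} + m\varepsilon/2 \Big).
\end{align*}
It remains to show that, for $\delta$ small enough, for $1 \le n \le k$,
\[
\inf\big\{ V_n(r,t) : r \ge Q(0)-\delta,\ t \le \delta \big\} \,\ge\, \psi(a) - \varepsilon/2.
\]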


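Indeed, by definition of $V_n$, we have
\begin{multline*}
\inf\big\{ V_n(r,t) : r \ge Q(0)-\delta,\ t \le \delta \big\} \\
=\, \inf\Big\{ \sum_{k=0}^{n-1} I\Big( \frac{\sigma e^{-a}\rho_k}{(\sigma-1)\rho_k+1},\, \rho_{k+1} \Big) : \rho_0 \ge Q(0)-\delta,\ \rho_n \le \delta,\ \rho_k \in [0,1] \text{ for } 0 \le k \le n \Big\}.
\end{multline*}
Note that, for any $t < Q(0)-\delta$,
\begin{align*}
\min_{Q(0)-\delta\le r\le1} I\Big( \frac{\sigma e^{-a} r}{(\sigma-1)r+1},\, t \Big) \,&=\, \min_{Q(0)-\delta\le r\le1} I(\Psi(r), t) \\
&=\, I(\Psi(Q(0)-\delta), t) \,=\, V_1(Q(0)-\delta, t).
\end{align*}
These equalities are consequences of the following facts: the function $I(p,t)$ is
increasing in $p$ for $t < p$, the mapping $\Psi$ is increasing, and
\[
\Psi(Q(0)-\delta) \,>\, Q(0)-\delta \,>\, t.
\]
Let $(\rho^*_k)_{0\le k\le n}$ be a sequence realizing the infimum above, and let $k^*$ be any of the
indices such that $\rho^*_{k^*-1} \ge Q(0)-\delta$ and $\rho^*_{k^*} < Q(0)-\delta$. Then,
\begin{align*}
\sum_{k=0}^{n-1} V_1(\rho^*_k, \rho^*_{k+1}) \,&\ge\, V_1(Q(0)-\delta, \rho^*_{k^*}) + \sum_{k=k^*}^{n-1} V_1(\rho^*_k, \rho^*_{k+1}) \\
&\ge\, \psi(a) - V_1(Q(0), Q(0)-\delta) - V_1(\delta, 0),
\end{align*}
which for $\delta$ small enough is greater than $\psi(a) - \varepsilon/2$, as wanted. □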


We end this section by giving a lower bound for the probability of reaching 0 in a
number of steps of order $\ln m$. Let us define the hitting time of 0 by
\[
\tau_0 \,=\, \inf\{\, n \ge 0 : Z_n = 0 \,\}.
\]

Lemma 17.15 Let $\varepsilon > 0$. There exists a $C(\varepsilon) > 0$ such that, for every $z \in \mathbb{I} \setminus \{0\}$,
we have, for $m$ large enough,
\[
P(\tau_0 \le C(\varepsilon)\ln m \mid Z_0 = z) \,\ge\, \exp\big( -m(\psi(a) + \varepsilon) \big).
\]


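Proof Let $\varepsilon, \delta > 0$ and suppose first that $z \in \mathbb{I}$ is greater than $\delta$. Then, by lemma 17.5,
there exists an $h(\delta) \in \mathbb{N}$ such that $\Psi^{h(\delta)}(z)$ is in the $\delta$-neighborhood of $Q(0)$. Pick a
sequence $(\rho_k)_{0\le k\le N(\varepsilon)}$ such that
\[
\rho_0 = Q(0), \qquad \rho_{N(\varepsilon)} = 0, \qquad \psi(a) \,\ge\, \sum_{k=0}^{N(\varepsilon)-1} V_1(\rho_k, \rho_{k+1}) - \varepsilon/2.
\]
Define next a sequence $(\pi_k)_{0\le k\le N(\varepsilon)+h(\delta)}$ by concatenating the sequence of iterates
of $z$ by $\Psi$ and the sequence $(\rho_k)$. We have then
\[
P(\tau_0 \le h(\delta) + N(\varepsilon) \mid Z_0 = z) \,\ge\, \prod_{k=0}^{h(\delta)+N(\varepsilon)-1} P\big(Z_{k+1} = \Pi(\pi_{k+1}) \mid Z_k = \Pi(\pi_k)\big).
\]
We use the estimates on the transition probabilities of $Z_n$ from the proof of the large
deviations principle 17.2 in order to obtain
\[
\liminf\ \frac{1}{m} \ln P(\tau_0 \le h(\delta) + N(\varepsilon) \mid Z_0 = z) \,\ge\, -\psi(a) - \varepsilon/2.
\]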


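Suppose next that $z < \delta$. By theorem 17.8, there exists a positive constant $C$ such
that, asymptotically,
\[
P\big(Z_{\lfloor C\ln m\rfloor} > \delta \mid Z_0 = z\big) \,\ge\, \frac{1}{(m+1)^{C\ln m}}.
\]
Therefore, for $z < \delta$, and choosing $C(\varepsilon)$ such that
\[
C\ln m + h(\delta) + N(\varepsilon) \,\le\, C(\varepsilon)\ln m,
\]
we have, for $m$ large enough,
\begin{align*}
P(\tau_0 \le C(\varepsilon)\ln m \mid Z_0 = z)
\,&\ge\, \sum_{z'>\delta} P\big(Z_{\lfloor C\ln m\rfloor} = z' \mid Z_0 = z\big)\, P(\tau_0 \le h(\delta) + N(\varepsilon) \mid Z_0 = z') \\
&\ge\, \frac{1}{(m+1)^{C\ln m}} \exp\big( -m(\psi(a) + \varepsilon/2) \big),
\end{align*}
which asymptotically is greater than $\exp(-m(\psi(a) + \varepsilon))$, as wanted. □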


Chapter 18
Mutation Dynamics


In this chapter, we focus on the dynamics of a single individual undergoing mutation
in the absence of the selection mechanism. Thus we consider the mutant chain
$(W_n)_{n\ge0}$ of section 8.2, i.e., the Markov chain with state space $\mathcal{A}^\ell$, having for its
transition matrix the mutation matrix $M$. We are mainly interested in comparing
the individual at time $n$ with the master sequence, so in section 18.1 we introduce
an auxiliary process to keep track of their differences. In section 18.2, we study
the induced dynamics on the Hamming classes. The remaining sections 18.3, 18.4,
18.5 are devoted to the derivation of asymptotic estimates on this process. In these
sections, we fix the parameter $a$, and we consider the asymptotic regime
\[
\ell \to +\infty, \qquad q \to 0, \qquad \ell q \to a \in\, ]0,+\infty[\,.
\]
By asymptotically, we mean that $\ell$ has to be sufficiently large, $q$ sufficiently small
and $\ell q$ sufficiently close to $a$.


18.1 Binary Process of Differences 


We define a process $(V_n)_{n\ge0}$ on $\{0,1\}^\ell$ by setting, for $1 \le i \le \ell$,
\[
V_n(i) \,=\,
\begin{cases}
0 & \text{if } W_n(i) = w^*(i), \\
1 & \text{if } W_n(i) \ne w^*(i).
\end{cases}
\]
The binary word $V_n$ indicates the sites where $W_n$ and $w^*$ differ. In particular, the random
variable $V_n$ is a deterministic function of $W_n$. In our models, the mutations occur
independently at each site. An important consequence of this structural assumption
is that the components of $W_n$, $(W_n(i),\ 1 \le i \le \ell)$, are themselves independent Markov
chains with state space $\mathcal{A}$ and transition matrix
\[
\begin{pmatrix}
1-q & \frac{q}{\kappa-1} & \cdots & \frac{q}{\kappa-1} \\
\frac{q}{\kappa-1} & 1-q & \ddots & \vdots \\
\vdots & \ddots & \ddots & \frac{q}{\kappa-1} \\
\frac{q}{\kappa-1} & \cdots & \frac{q}{\kappa-1} & 1-q
\end{pmatrix}.
\]
The non-diagonal terms in this matrix are all equal. Applying the lumping theorem A.3,
we observe that each component $V_n(i)$ is still a Markov chain, in fact it is
the two-state Markov chain that we define and study next. Let $(E_n)_{n\ge0}$ be the Markov
chain with state space $\{0,1\}$ and transition matrix
\[
T \,=\,
\begin{pmatrix}
1-q & q \\
\frac{q}{\kappa-1} & 1-\frac{q}{\kappa-1}
\end{pmatrix}.
\]


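The eigenvalues of $T$ are 1 and
\[
\lambda \,=\, 1 - \frac{\kappa q}{\kappa-1}.
\]
We compute, for $n \ge 1$,
\[
T^n \,=\,
\begin{pmatrix}
\frac{1}{\kappa} + \big(1-\frac{1}{\kappa}\big)\lambda^n & \big(1-\frac{1}{\kappa}\big)(1-\lambda^n) \\[4pt]
\frac{1}{\kappa}(1-\lambda^n) & 1 - \frac{1}{\kappa} + \frac{1}{\kappa}\lambda^n
\end{pmatrix}.
\]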


Here is a simple illuminating way to realize the Markov chain $(E_n)_{n\ge0}$ and to
understand the expression of the $n$-th power $T^n$. Think of $(E_n)_{n\ge0}$ as being the
Markov chain $(V_n(1))_{n\ge0}$. Let $(\varepsilon_n)_{n\ge1}$ be an i.i.d. sequence of Bernoulli random
variables with parameter $\lambda$. Suppose that $E_{n-1} = e \in \{0,1\}$. If $\varepsilon_n = 1$, then we set
$E_n = e$. If $\varepsilon_n = 0$, then we first choose a letter uniformly over $\mathcal{A}$, and then we set
$E_n = 0$ if this letter agrees with $w^*(1)$ and we set $E_n = 1$ otherwise, independently
of the past history until time $n$. Now, the event $E_n = E_0$ can occur in two different
ways. Either $\varepsilon_1 = \cdots = \varepsilon_n = 1$, or one of the $\varepsilon_1, \dots, \varepsilon_n$ is zero, in which case the
letter present at time $n$ is uniformly distributed over $\mathcal{A}$, thus
\[
P(E_n = 0 \mid E_0 = 0) \,=\, \lambda^n + (1-\lambda^n)\frac{1}{\kappa},
\]
\[
P(E_n = 1 \mid E_0 = 1) \,=\, \lambda^n + (1-\lambda^n)\Big(1-\frac{1}{\kappa}\Big),
\]
and we recover the expression of the diagonal coefficients of $T^n$. Similarly, the event
$E_n \ne E_0$ can occur only if one of the $\varepsilon_1, \dots, \varepsilon_n$ is zero, and the last mutation event
yields the adequate letter, thus
\[
P(E_n = 1 \mid E_0 = 0) \,=\, (1-\lambda^n)\Big(1-\frac{1}{\kappa}\Big),
\]
\[
P(E_n = 0 \mid E_0 = 1) \,=\, (1-\lambda^n)\frac{1}{\kappa}.
\]


From these computations, we see that $E_n$ is a Bernoulli random variable whose
parameter is one of the two numbers
\[
\alpha_n \,=\, \Big(1-\frac{1}{\kappa}\Big)(1-\lambda^n) \qquad\text{or}\qquad \beta_n \,=\, 1 - \frac{1}{\kappa} + \frac{1}{\kappa}\lambda^n. \tag{18.1}
\]
More precisely, if $E_0 = 0$, then $E_n$ is a Bernoulli random variable with parameter
$\alpha_n$; if $E_0 = 1$, then $E_n$ is a Bernoulli random variable with parameter $\beta_n$. Therefore
the random variable $E_n$ is stochastically bounded by two Bernoulli random variables
as follows:
\[
\mathrm{Ber}(\alpha_n) \,\preceq\, E_n \,\preceq\, \mathrm{Ber}(\beta_n). \tag{18.2}
\]


Coming back to the process (V,,)n>0, the same conclusions hold component wise. 
Moreover the evolution of the components is independent. This will yield efficient 
bounds, that we develop in the next section. 
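The closed form of the $n$-step transition probabilities of the two-state chain can be compared with a direct matrix power. The sketch below is a plain-Python check, with hypothetical helper names and illustrative values of $\kappa$ and $q$; it uses $\lambda = 1 - \kappa q/(\kappa-1)$ and the parameters $\alpha_n$, $\beta_n$ of (18.1), so that row 0 of $T^n$ is $(1-\alpha_n, \alpha_n)$ and row 1 is $(1-\beta_n, \beta_n)$.

```python
def transition_matrix(kappa, q):
    """Two-state chain: state 0 = site agrees with the master sequence, state 1 = it differs."""
    return [[1.0 - q, q],
            [q / (kappa - 1.0), 1.0 - q / (kappa - 1.0)]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_pow(t, n):
    """n-th power of a 2x2 matrix by repeated multiplication (n >= 0)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, t)
    return result

def closed_form_power(kappa, q, n):
    """T^n written with alpha_n and beta_n from (18.1)."""
    lam = 1.0 - kappa * q / (kappa - 1.0)
    alpha_n = (1.0 - 1.0 / kappa) * (1.0 - lam ** n)
    beta_n = 1.0 - 1.0 / kappa + lam ** n / kappa
    return [[1.0 - alpha_n, alpha_n],
            [1.0 - beta_n, beta_n]]

def max_gap(kappa, q, n):
    """Largest entrywise difference between the matrix power and the closed form."""
    direct, closed = mat_pow(transition_matrix(kappa, q), n), closed_form_power(kappa, q, n)
    return max(abs(direct[i][j] - closed[i][j]) for i in range(2) for j in range(2))
```

Since $\alpha_n \le \beta_n$ for every $n$, the second column of $T^n$ also displays the stochastic ordering (18.2) directly.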


18.2 Hamming Class Dynamics 


We study here the dynamics on the Hamming classes induced by the process $(W_n)_{n\ge0}$.
More precisely, we are interested in the process
\[
Y_n \,=\, \sum_{1\le i\le\ell} V_n(i).
\]
From the conclusions of the previous section, if we start with $Y_0 = k$, then
\[
Y_n \,\sim\, \mathrm{Bin}(\ell-k, \alpha_n) + \mathrm{Bin}(k, \beta_n), \tag{18.3}
\]
where the two binomials are independent. From this formula, or from inequality (18.2),
we have the following stochastic bounds: uniformly over the starting point
$Y_0$,
\[
\mathrm{Bin}(\ell, \alpha_n) \,\preceq\, \sum_{1\le i\le\ell} V_n(i) \,\preceq\, \mathrm{Bin}(\ell, \beta_n). \tag{18.4}
\]


The parameters $\alpha_n$ and $\beta_n$ converge towards $1 - 1/\kappa$ at speed $\lambda^n$. Now
\[
\lambda \,=\, 1 - \frac{\kappa q}{\kappa-1}, \tag{18.5}
\]
and $q$ is of order $1/\ell$, therefore the binomial laws $\mathrm{Bin}(\ell, \alpha_n)$ and $\mathrm{Bin}(\ell, \beta_n)$ are
polynomially close to their limit $\mathrm{Bin}(\ell, 1-1/\kappa)$ when $n$ is of order $\ell\ln\ell$. Another
application of the lumping theorem A.3 yields that $(Y_n)_{n\ge0}$ is a Markov chain with
state space $\{0,\dots,\ell\}$ having for its transition matrix the lumped mutation matrix
$M_H$ defined before lemma 6.1, which we recall next: for $b, c \in \{0,\dots,\ell\}$, we have
\[
M_H(b,c) \,=\, P(Y - X = c - b),
\]
where $X, Y$ are independent and $X \sim \mathrm{Bin}(b, q/(\kappa-1))$, $Y \sim \mathrm{Bin}(\ell-b, q)$.


Proposition 18.1 The matrix $M_H$ is reversible with respect to the binomial law
$\mathrm{Bin}(\ell, 1-1/\kappa)$ with parameters $\ell$ and $1-1/\kappa$. This binomial law is the invariant
probability measure of the Markov chain $(Y_n)_{n\ge0}$.


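Proof We denote simply by $\mathcal{B}$ the binomial law $\mathrm{Bin}(\ell, 1-1/\kappa)$. Note that the
Markov chain $E_n$ is reversible with respect to the Bernoulli law $\mathrm{Ber}(1-1/\kappa)$, and so
are each of the $\ell$ independent Markov chains $V_n(i)$, $1 \le i \le \ell$. Take $(V_0(i))_{1\le i\le\ell}$ to
be independent and identically distributed random variables with law $\mathrm{Ber}(1-1/\kappa)$.
We check that the matrix $M_H$ is reversible with respect to $\mathcal{B}$. Let $b, c \in \{0,\dots,\ell\}$.
We have:
\begin{align*}
\mathcal{B}(b) M_H(b,c) \,&=\, P\Big( \sum_{1\le i\le\ell} V_0(i) = b \Big)\, P\Big( \sum_{1\le i\le\ell} V_1(i) = c \,\Big|\, \sum_{1\le i\le\ell} V_0(i) = b \Big) \\
&=\, P\Big( \sum_{1\le i\le\ell} V_0(i) = b,\ \sum_{1\le i\le\ell} V_1(i) = c \Big) \,=\, \sum \prod_{1\le i\le\ell} P\big(V_0(i) = \varepsilon_i,\ V_1(i) = \varepsilon'_i\big),
\end{align*}
where the sum is over all the pairs of sequences $(\varepsilon_i)_{1\le i\le\ell}, (\varepsilon'_i)_{1\le i\le\ell} \in \{0,1\}^\ell$ such
that $\sum_i \varepsilon_i = b$ and $\sum_i \varepsilon'_i = c$. Since the Markov chains $(V_n(i))_{n\ge0}$, $1 \le i \le \ell$, are
reversible with respect to the law $\mathrm{Ber}(1-1/\kappa)$, we obtain
\begin{align*}
\mathcal{B}(b) M_H(b,c) \,&=\, \sum \prod_{1\le i\le\ell} P\big(V_0(i) = \varepsilon'_i,\ V_1(i) = \varepsilon_i\big) \\
&=\, P\Big( \sum_{1\le i\le\ell} V_0(i) = c,\ \sum_{1\le i\le\ell} V_1(i) = b \Big) \,=\, \mathcal{B}(c) M_H(c,b).
\end{align*}
We obtain the same expression as before, but with $b$ and $c$ exchanged. Thus the
matrix $M_H$ is reversible with respect to $\mathcal{B}$ and $\mathcal{B}$ is the invariant probability measure
of the Markov chain $(Y_n)_{n\ge0}$. □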
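The reversibility statement can be checked numerically on a small instance. The sketch below builds the lumped mutation matrix from its definition $M_H(b,c) = P(Y - X = c - b)$, with $X \sim \mathrm{Bin}(b, q/(\kappa-1))$ and $Y \sim \mathrm{Bin}(\ell-b, q)$ independent, and measures the largest violation of the detailed balance equations with respect to $\mathrm{Bin}(\ell, 1-1/\kappa)$. The function names and the tested parameters are illustrative choices.

```python
import math

def binom_pmf(n, p, k):
    """P(Bin(n, p) = k), equal to 0 outside {0, ..., n}."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def lumped_mutation_matrix(ell, q, kappa):
    """M_H(b, c) = P(Y - X = c - b) with X ~ Bin(b, q/(kappa-1)), Y ~ Bin(ell-b, q) independent."""
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for x in range(b + 1):                  # x differing sites mutate back to the master letter
            px = binom_pmf(b, q / (kappa - 1.0), x)
            for y in range(ell - b + 1):        # y agreeing sites mutate away from it
                M[b][b - x + y] += px * binom_pmf(ell - b, q, y)
    return M

def detailed_balance_gap(ell, q, kappa):
    """Largest |B(b) M_H(b,c) - B(c) M_H(c,b)| with B = Bin(ell, 1 - 1/kappa)."""
    M = lumped_mutation_matrix(ell, q, kappa)
    B = [binom_pmf(ell, 1.0 - 1.0 / kappa, b) for b in range(ell + 1)]
    return max(abs(B[b] * M[b][c] - B[c] * M[c][b])
               for b in range(ell + 1) for c in range(ell + 1))
```

Detailed balance immediately implies invariance, since summing $\mathcal{B}(b) M_H(b,c) = \mathcal{B}(c) M_H(c,b)$ over $b$ gives $\sum_b \mathcal{B}(b) M_H(b,c) = \mathcal{B}(c)$.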


When f grows, the distribution Bin(¢, 1 — 1/x) concentrates exponentially fast in a 
neighborhood of its mean 
€ = €1—-1/k). 


We estimate next the probability of the points to the left of ¢,. 
Lemma 18.2 We denote by $\mathcal{B}$ the binomial law $\mathrm{Bin}(\ell, 1 - 1/\kappa)$. For $b \le \ell/2$, we
have

$$\frac{1}{\kappa^\ell} \Big(\frac{\ell}{2b}\Big)^b \le \mathcal{B}(b) \le \frac{\ell^b}{\kappa^{\ell - b}}.$$

Proof Let $b \le \ell/2$. Then


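$$\mathcal{B}(b) = \binom{\ell}{b} \Big(1 - \frac{1}{\kappa}\Big)^b \Big(\frac{1}{\kappa}\Big)^{\ell - b}
\ge \binom{\ell}{b} \frac{1}{\kappa^\ell}
\ge \frac{(\ell - b)^b}{b^b} \frac{1}{\kappa^\ell}
\ge \Big(\frac{\ell}{2b}\Big)^b \frac{1}{\kappa^\ell}.$$

The upper bound on $\mathcal{B}(b)$ is straightforward. □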
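As a sanity check of the lower bound of lemma 18.2, the following sketch compares $(\ell/(2b))^b/\kappa^\ell$ with the exact binomial weight, using exact rational arithmetic so that no rounding can blur the comparison. The helper names and test values are ours, and the check starts at $b = 1$ to avoid the degenerate expression $(\ell/0)^0$.

```python
from fractions import Fraction
from math import comb


def binomial_weight(ell, kappa, b):
    """Exact value of Bin(ell, 1 - 1/kappa) at the point b."""
    p = Fraction(kappa - 1, kappa)
    return comb(ell, b) * p**b * (1 - p) ** (ell - b)


def lower_bound(ell, kappa, b):
    """The quantity (ell / (2b))^b / kappa^ell appearing in the lemma."""
    return Fraction(1, kappa**ell) * Fraction(ell, 2 * b) ** b


def lower_bound_holds(ell, kappa):
    """Check the lower bound for every 1 <= b <= ell/2."""
    return all(
        lower_bound(ell, kappa, b) <= binomial_weight(ell, kappa, b)
        for b in range(1, ell // 2 + 1)
    )


if __name__ == "__main__":
    print(lower_bound_holds(20, 4))
```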


18.3 Time away from the Equilibrium 


We give first a rough estimate for the probability of reaching a neighborhood of $\ell_\kappa$
in a time $\ell \ln \ell$. The meaning of “asymptotically” is recalled at the beginning of the
chapter.


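Proposition 18.3 Let $\delta > 0$. There exists a positive constant $c$ depending on $\delta$ such
that, asymptotically,

$$\forall i \in \{1,\dots,\ell\} \qquad P\big(Y_{\lfloor \ell \ln \ell \rfloor} \ge \ell_\kappa (1 - \delta) \,\big|\, Y_0 = i\big) \ge 1 - \exp(-c\ell).$$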


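Proof We use the stochastic inequalities (18.4). Let us set $n = \lfloor \ell \ln \ell \rfloor$. Let $X$ be a
random variable with distribution $\mathrm{Bin}(\ell, \alpha_n)$. We have, for any $i \in \{1,\dots,\ell\}$,

$$P\big(Y_n \ge \ell_\kappa (1 - \delta) \,\big|\, Y_0 = i\big) \ge P\big(X \ge \ell_\kappa (1 - \delta)\big) = 1 - P\big(X < \ell_\kappa (1 - \delta)\big). \tag{18.6}$$

From formulas (18.1) and (18.5), we have

$$\alpha_n = \Big(1 - \frac{1}{\kappa}\Big) \Big(1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n\Big).$$

Sending $\ell$ to $\infty$, and recalling that $n = \lfloor \ell \ln \ell \rfloor$, we have

$$\lim_{\ell \to \infty} \alpha_n = 1 - \frac{1}{\kappa},$$

and this implies that $\ell \alpha_n > \ell_\kappa (1 - \delta/2)$ for $\ell$ large enough. Recall that $\ell \alpha_n$ is the
expected value of $X$. We can therefore use Hoeffding’s inequality (see theorem A.4)
to control the last probability in formula (18.6) and we obtain

$$\begin{aligned}
P\big(X < \ell_\kappa (1 - \delta)\big)
&\le \exp\Big(-2 \ell \Big(\alpha_n - \Big(1 - \frac{1}{\kappa}\Big)(1 - \delta)\Big)^2\Big) \\
&\le \exp\Big(-2 \Big(\frac{(\kappa - 1)\delta}{2\kappa}\Big)^2 \ell\Big),
\end{aligned}$$

where of course the last inequality holds for $\ell$ large enough. □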
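The Hoeffding step used above can be illustrated numerically: for a binomial variable $X \sim \mathrm{Bin}(\ell, \alpha)$ and a threshold $x \le \ell\alpha$, the exact lower tail is dominated by $\exp(-2(\ell\alpha - x)^2/\ell)$. The sketch below is a plain check of that inequality; the parameter values ($\ell = 200$, $\kappa = 4$, $\delta = 0.2$, $\alpha = 0.74$) are arbitrary illustrative choices.

```python
from math import comb, exp


def binom_cdf_strict(n, p, x):
    """Exact P(Bin(n, p) < x) for a real threshold x."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1) if k < x
    )


def hoeffding_lower_tail(n, p, x):
    """Hoeffding bound exp(-2 (np - x)^2 / n) on P(Bin(n, p) <= x), valid for x <= np."""
    t = n * p - x
    if t < 0:
        raise ValueError("the threshold must lie below the mean")
    return exp(-2 * t * t / n)


if __name__ == "__main__":
    ell, kappa, delta = 200, 4, 0.2
    threshold = ell * (1 - 1 / kappa) * (1 - delta)  # ell_kappa * (1 - delta)
    alpha = 0.74  # a value close to the limit 1 - 1/kappa
    print(binom_cdf_strict(ell, alpha, threshold), hoeffding_lower_tail(ell, alpha, threshold))
```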


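For $\delta > 0$, we define the hitting time $\tau_{\kappa,\delta}$ by

$$\tau_{\kappa,\delta} = \inf \{\, n \ge 1 : Y_n \ge \ell_\kappa (1 - \delta) \,\}.$$

The estimate of proposition 18.3 yields the following estimate on $\tau_{\kappa,\delta}$.

Corollary 18.4 Let $\delta > 0$. There exists a $c = c(\delta) > 0$ such that, asymptotically,

$$\forall y_0 \in \{1,\dots,\ell\} \qquad P\big(\tau_{\kappa,\delta} > n \,\big|\, Y_0 = y_0\big) \le \exp\Big(-c\ell \Big\lfloor \frac{n}{\ell \ln \ell} \Big\rfloor\Big).$$

Proof Let $c$ be associated to $\delta$ as in proposition 18.3. Asymptotically, we have

$$\forall i \in \{1,\dots,\ell\} \qquad P\big(\tau_{\kappa,\delta} > \lfloor \ell \ln \ell \rfloor \,\big|\, Y_0 = i\big) \le \exp(-c\ell).$$

Once we have this uniform bound, the result of the corollary follows from
lemma 16.5. □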


18.4 Reaching the Equilibrium 


We will need further estimates on the speed at which the process $Y_n$ reaches a
neighborhood of $\ell_\kappa$. The meaning of “asymptotically” is recalled at the beginning of
the chapter. We estimate first the speed at which $Y_n$ goes away from 0.


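Proposition 18.5 Asymptotically, we have

$$\forall n \ge \frac{3}{a} (\ln \ell)(\ln \ln \ell) \qquad P\big(Y_n \ge \ln \ell \,\big|\, Y_0 = 0\big) \ge 1 - \frac{1}{\ln \ell}.$$

Proof We use the stochastic inequalities (18.4). Let $X$ be a random variable with
distribution $\mathrm{Bin}(\ell, \alpha_n)$. We have

$$P\big(Y_n \ge \ln \ell \,\big|\, Y_0 = 0\big) \ge P(X \ge \ln \ell).$$

To obtain the relevant estimate, we shall use a simple block argument. The distribution
$\mathrm{Bin}(\ell, \alpha_n)$ can be realized as the sum of $k$ independent random variables with
distribution $\mathrm{Bin}(\ell/k, \alpha_n)$. We apply this idea with $k = \ln \ell$ and we observe that
$X \ge k$ occurs if the $k$ random variables are non-zero. Let $Y$ be a random variable
with distribution $\mathrm{Bin}(\ell/\ln \ell, \alpha_n)$. We get

$$P(X \ge \ln \ell) \ge \big(P(Y \ge 1)\big)^{\ln \ell} = \Big(1 - (1 - \alpha_n)^{\ell/\ln \ell}\Big)^{\ln \ell}. \tag{18.7}$$

From formulas (18.1) and (18.5), we have

$$1 - \alpha_n = \frac{1}{\kappa} + \Big(1 - \frac{1}{\kappa}\Big) \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n \le \frac{1}{\kappa} + \Big(1 - \frac{1}{\kappa}\Big) \exp\Big(-\frac{\kappa q n}{\kappa - 1}\Big).$$

This implies that, for $n \ge \frac{3}{a} (\ln \ell)(\ln \ln \ell)$,

$$\begin{aligned}
1 - \alpha_n &\le \frac{1}{\kappa} + \Big(1 - \frac{1}{\kappa}\Big) \exp\Big(-\frac{\kappa q}{\kappa - 1} \cdot \frac{3}{a} (\ln \ell)(\ln \ln \ell)\Big) \\
&= 1 - \frac{3}{\ell} (\ln \ell)(\ln \ln \ell) + o\Big(\frac{1}{\ell} (\ln \ell)(\ln \ln \ell)\Big),
\end{aligned}$$

whence

$$(1 - \alpha_n)^{\ell/\ln \ell} \le \exp\big(-3 \ln \ln \ell + o(\ln \ln \ell)\big).$$

Substituting this into inequality (18.7) and performing an asymptotic expansion, we
obtain that

$$\begin{aligned}
P(X \ge \ln \ell) &\ge \Big(1 - \exp\big(-3 \ln \ln \ell + o(\ln \ln \ell)\big)\Big)^{\ln \ell} \\
&= 1 - \frac{1}{(\ln \ell)^2} + o\Big(\frac{1}{(\ln \ell)^2}\Big),
\end{aligned}$$

and this provides the desired estimate. □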
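The block argument behind inequality (18.7) can be tested directly: if a binomial variable with $n$ trials is split into $k$ independent blocks of $n/k$ trials, then "every block is non-zero" implies "the total is at least $k$". The sketch below compares the exact tail with the resulting product bound. It assumes that $k$ is an integer dividing $n$ (the text uses $k = \ln \ell$ informally), and the numerical values are arbitrary.

```python
from math import comb


def binom_tail(n, p, k):
    """Exact P(Bin(n, p) >= k)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def block_lower_bound(n, p, k):
    """(1 - (1 - p)^(n/k))^k: probability that each of the k blocks is non-zero."""
    if n % k != 0:
        raise ValueError("this sketch assumes that k divides n")
    return (1 - (1 - p) ** (n // k)) ** k


if __name__ == "__main__":
    n, p, k = 60, 0.03, 5  # hypothetical values
    print(binom_tail(n, p, k), block_lower_bound(n, p, k))
```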


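We finally give a refinement of proposition 18.3 by considering a smaller neighborhood of $\ell_\kappa$.

Proposition 18.6 Asymptotically, we have

$$\forall n \ge \frac{1}{a} \ell \ln \ell \qquad P\big(Y_n \ge \ell_\kappa - \sqrt{\ell \ln \ell} \,\big|\, Y_0 = 0\big) \ge 1 - \frac{1}{\ell}.$$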


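Proof We use the stochastic inequalities (18.4). Let $X$ be a random variable with
distribution $\mathrm{Bin}(\ell, \alpha_n)$, we have

$$\begin{aligned}
P\big(Y_n \ge \ell_\kappa - \sqrt{\ell \ln \ell} \,\big|\, Y_0 = 0\big) &\ge P\big(X \ge \ell_\kappa - \sqrt{\ell \ln \ell}\big) \\
&= 1 - P\big(X < \ell_\kappa - \sqrt{\ell \ln \ell}\big).
\end{aligned} \tag{18.8}$$

From formulas (18.1) and (18.5), we have

$$\alpha_n = \Big(1 - \frac{1}{\kappa}\Big) \Big(1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n\Big).$$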


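In the regime where $\ell q \to a$ and $n \ge \frac{1}{a} \ell \ln \ell$, we see that

$$\alpha_n = 1 - \frac{1}{\kappa} + o\Big(\frac{1}{\ell}\Big),$$

and this implies that, asymptotically,

$$\ell \alpha_n > \ell_\kappa - \sqrt{\ell \ln \ell}.$$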


Recall that $\ell \alpha_n$ is the expected value of $X$. We can therefore use Hoeffding’s inequality
(see theorem A.4) to control the last probability in formula (18.8). Asymptotically,
we have, for $n \ge \frac{1}{a} \ell \ln \ell$,

$$\begin{aligned}
P\big(X < \ell_\kappa - \sqrt{\ell \ln \ell}\big) &\le \exp\Big(-\frac{2}{\ell} \big(\ell \alpha_n - \ell_\kappa + \sqrt{\ell \ln \ell}\big)^2\Big) \\
&= \exp\big(-2 \ln \ell + o(\ln \ell)\big) \le \frac{1}{\ell},
\end{aligned}$$

where of course the last inequality holds for $\ell$ large enough. □


18.5 Escape from the Equilibrium 


The mutation dynamics escapes the neutral equilibrium when it recovers the master
sequence, that is when the process $Y_n$ on the Hamming classes hits 0. So we define

$$\tau_0 = \inf \{\, n \ge 1 : Y_n = 0 \,\}$$

and we seek here an upper bound on the expectation of $\tau_0$. To do so, we estimate
the probability of hitting 0 in a fixed time interval. We look for an estimate which is
uniform with respect to the starting point, with the help of the stochastic inequalities (18.4).
Let us fix $n \ge 1$. We have, for any $y_0 \in \{1,\dots,\ell\}$,

$$P(\tau_0 \le n \mid Y_0 = y_0) \ge P(Y_n = 0 \mid Y_0 = y_0) \ge P\big(\mathrm{Bin}(\ell, \beta_n) = 0\big) = (1 - \beta_n)^\ell,$$

thus, by lemma 16.5,

$$E(\tau_0 \mid Y_0 = y_0) \le \frac{n}{(1 - \beta_n)^\ell}.$$

Replacing $\beta_n$ by its value (18.1), we obtain that

$$\forall y_0 \in \{1,\dots,\ell\} \quad \forall n \ge 1 \qquad E(\tau_0 \mid Y_0 = y_0) \le n\, \kappa^\ell \Big(1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n\Big)^{-\ell}. \tag{18.9}$$

In our regime, the parameter $q$ is of order $1/\ell$, so to have the correct exponential order
with this inequality, we shall take $n$ of order $\ell \ln \ell$. The meaning of “asymptotically”
is recalled at the beginning of the chapter.

Proposition 18.7 Asymptotically, we have

$$\forall y_0 \in \{1,\dots,\ell\} \qquad E(\tau_0 \mid Y_0 = y_0) \le \frac{2}{a} (\ell \ln \ell)\, \kappa^\ell.$$


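Proof We apply formula (18.9) with $n = \frac{1}{a} (\ell \ln \ell)$. We expand successively

$$\Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n = \exp\Big(-\frac{\kappa}{\kappa - 1} \ln \ell + o(\ln \ell)\Big),$$

$$\ell \ln \Big[1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n\Big] \sim -\ell \exp\Big(-\frac{\kappa}{\kappa - 1} \ln \ell + o(\ln \ell)\Big),$$

whence

$$\lim_{\ell \to \infty}\ \ell \ln \Big[1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^n\Big] = 0.$$

This limit and inequality (18.9) with $n = \frac{1}{a} (\ell \ln \ell)$ together imply that the inequality
stated in the proposition holds asymptotically. □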
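To see why the choice $n \approx \frac{1}{a}\ell\ln\ell$ costs only a bounded factor, one can evaluate the bound $n/(1-\beta_n)^\ell$ numerically. The sketch below assumes the single-digit dynamics suggested by the definition of $M_H$: a correct digit becomes wrong with probability $q$, a wrong digit is repaired with probability $q/(\kappa-1)$, and $\beta_n$ is the probability that a digit which is wrong at time 0 is wrong at time $n$. The closed form used for $\beta_n$ is derived from that assumption and is cross-checked against a direct iteration. All names are illustrative.

```python
def beta_iterated(kappa, q, n):
    """P(digit wrong at time n | wrong at time 0), by iterating the two-state chain."""
    wrong = 1.0
    for _ in range(n):
        wrong = wrong * (1 - q / (kappa - 1)) + (1 - wrong) * q
    return wrong


def beta_closed(kappa, q, n):
    """Closed form 1 - (1 - rho^n)/kappa with rho = 1 - kappa*q/(kappa - 1)."""
    rho = 1 - kappa * q / (kappa - 1)
    return 1 - (1 - rho**n) / kappa


def hitting_time_bound(ell, kappa, q, n):
    """Upper bound n / (1 - beta_n)^ell on the expected hitting time of 0."""
    return n / (1 - beta_closed(kappa, q, n)) ** ell


if __name__ == "__main__":
    ell, kappa, a = 50, 4, 1.0
    q = a / ell
    n = 196  # roughly (1/a) * ell * ln(ell)
    # Ratio to n * kappa^ell; it should stay below 2 for these values.
    print(hitting_time_bound(ell, kappa, q, n) / (n * kappa**ell))
```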


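We need also to control the probability that the process $Y_n$ exits a neighborhood of
$\ell_\kappa$ and reaches a neighborhood of 0 without coming back to the neighborhood of $\ell_\kappa$.
For $\delta > 0$, we consider again the hitting times $\tau_0$ and $\tau_{\kappa,\delta}$ given by

$$\tau_0 = \inf \{\, n \ge 1 : Y_n = 0 \,\}, \qquad \tau_{\kappa,\delta} = \inf \{\, n \ge 1 : Y_n \ge \ell_\kappa (1 - \delta) \,\}.$$

Proposition 18.8 For any $\varepsilon > 0$, there exists a $\delta > 0$ such that, asymptotically,

$$\forall k \ge \ell_\kappa (1 - \delta) \qquad P\big(\tau_0 < \tau_{\kappa,\delta} \,\big|\, Y_0 = k\big) \le \frac{1}{(\kappa - \varepsilon)^\ell}.$$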


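Proof Let $\delta > 0$ and let $c > 0$ be a constant associated with $\delta$ as
in corollary 18.4. Let $k \ge \ell_\kappa (1 - \delta)$. We write, for $n \ge 1$,

$$\begin{aligned}
P\big(\tau_0 < \tau_{\kappa,\delta} \,\big|\, Y_0 = k\big) &\le P\big(\tau_{\kappa,\delta} > n \,\big|\, Y_0 = k\big) + P\big(\tau_0 < n \,\big|\, Y_0 = k\big) \\
&\le \exp\Big(-c\ell \Big\lfloor \frac{n}{\ell \ln \ell} \Big\rfloor\Big) + \sum_{t=1}^{n-1} P\big(Y_t = 0 \,\big|\, Y_0 = k\big).
\end{aligned} \tag{18.10}$$

Let us focus on the last probability. From the identity (18.3) and formula (18.1), we
have

$$\begin{aligned}
P\big(Y_t = 0 \,\big|\, Y_0 = k\big) &= (1 - \alpha_t)^{\ell - k} (1 - \beta_t)^k \\
&= \frac{1}{\kappa^\ell} \Big(1 + (\kappa - 1) \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^t\Big)^{\ell - k} \Big(1 - \Big(1 - \frac{\kappa q}{\kappa - 1}\Big)^t\Big)^k.
\end{aligned}$$

The previous quantity decreases with $k$, therefore, for $k \ge \ell_\kappa (1 - \delta)$, we have, writing
$\rho^t$ as a shorthand for $\big(1 - \frac{\kappa q}{\kappa - 1}\big)^t$,

$$\begin{aligned}
P\big(Y_t = 0 \,\big|\, Y_0 = k\big) &\le \frac{1}{\kappa^\ell} \big(1 + (\kappa - 1) \rho^t\big)^{\ell - \ell_\kappa (1 - \delta)} \big(1 - \rho^t\big)^{\ell_\kappa (1 - \delta)} \\
&\le \frac{1}{\kappa^\ell} \exp\Big(\big(\ell - \ell_\kappa (1 - \delta)\big)(\kappa - 1) \rho^t - \ell_\kappa (1 - \delta) \rho^t\Big) \le \frac{1}{\kappa^\ell} \exp(\kappa \ell \delta).
\end{aligned}$$

Substituting this into inequality (18.10), we obtain

$$P\big(\tau_0 < \tau_{\kappa,\delta} \,\big|\, Y_0 = k\big) \le \exp\Big(-c\ell \Big\lfloor \frac{n}{\ell \ln \ell} \Big\rfloor\Big) + \frac{n}{\kappa^\ell} \exp(\kappa \ell \delta).$$

We choose $n = C \ell \ln \ell$, where $C$ is a positive constant. By choosing $C$ large enough,
say such that $cC > 2 \ln \kappa$, and $\delta$ small enough, we obtain the desired statement. □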


Chapter 19
The Neutral Phase


The aim of this chapter is to study the dynamics of the process $(C_n)_{n \in \mathbb{N}}$ when
there is no master sequence present in the population. We denote by $\mathcal{N}$ the set
of the populations which do not contain the master sequence $w^*$ and by $\mathcal{M}$ its
complementary set, i.e.,

$$\mathcal{N} = \big(\mathcal{A}^\ell \setminus \{w^*\}\big)^m, \qquad \mathcal{M} = \big(\mathcal{A}^\ell\big)^m \setminus \mathcal{N}.$$

Since we are dealing with the sharp peak landscape, the transition mechanism of
the process restricted to the set $\mathcal{N}$ is neutral. We consider a Wright-Fisher process
$(X_n)_{n \ge 0}$ starting from a population of $\mathcal{N}$. We wish to evaluate the first time when a
master sequence appears in the population:

$$\tau_{\mathcal{M}} = \inf \{\, n \ge 1 : X_n \in \mathcal{M} \,\}.$$

We call the time $\tau_{\mathcal{M}}$ the discovery time. Until the time $\tau_{\mathcal{M}}$, the process evolves in
$\mathcal{N}$ and the dynamics of the Wright-Fisher model in $\mathcal{N}$ does not depend on $\sigma$. In
particular, the dynamics of the process until the discovery time $\tau_{\mathcal{M}}$ is the same for
the Wright-Fisher model with $\sigma > 1$ and for the neutral Wright-Fisher model with
$\sigma = 1$. In this chapter, we will estimate various events which occur before the time
$\tau_{\mathcal{M}}$. Therefore, we compute the estimates for the latter model.


Neutral hypothesis. Throughout this chapter, we suppose that $\sigma = 1$.


Asymptotic regime. Throughout this chapter, we fix also the parameters $a$, $\alpha$, and
we consider the asymptotic regime

$$m, \ell \to +\infty, \qquad q \to 0, \qquad \ell q \to a \in\, ]0, +\infty[, \qquad \frac{m}{\ell} \to \alpha \in [0, +\infty[\,.$$

By asymptotically, we mean that $m, \ell$ have to be sufficiently large, $q$ sufficiently
small, $\ell q$ sufficiently close to $a$, and $m/\ell$ sufficiently close to $\alpha$. For reasons of
space, we do not treat the case $\alpha = +\infty$, which requires different arguments.


19.1 Ancestral Lines


Let us define an ancestral line associated to a trajectory of the Wright-Fisher process.
For $i \in \{1,\dots,m\}$ and $n \ge 1$, we denote by $\mathcal{I}(i, n, n-1)$ the index of the ancestor
at time $n - 1$ of the $i$-th individual at time $n$. More precisely, if the $i$-th individual of
the population at time $n$ has been obtained by replicating the $j$-th individual of the
population at time $n - 1$, then $\mathcal{I}(i, n, n-1) = j$. For $s < n$, the index $\mathcal{I}(i, n, s)$ of the
ancestor at time $s$ of the $i$-th individual at time $n$ is then defined recursively with the
help of the following formula:

$$\mathcal{I}(i, n, s) = \mathcal{I}\big(\mathcal{I}(i, n, n-1),\, n-1,\, s\big).$$

We define also $\mathcal{I}(i, n, n) = i$. The ancestor at time $s$ of the $i$-th individual at time $n$ is
the individual

$$\mathrm{ancestor}(i, n, s) = X_s\big(\mathcal{I}(i, n, s)\big).$$

The ancestral line of the $i$-th individual at time $n$ is the sequence of its ancestors until
time 0,

$$\big(\mathrm{ancestor}(i, n, s),\ 0 \le s \le n\big) = \big(X_s(\mathcal{I}(i, n, s)),\ 0 \le s \le n\big).$$
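The recursive definition of the ancestor index translates into a short loop. The sketch below is only an illustration of the bookkeeping: it uses 0-based indices (the text numbers individuals from 1), and the layout of the hypothetical `parents` and `populations` tables is our own convention.

```python
def ancestor_index(parents, i, n, s):
    """Index at time s of the ancestor of individual i of generation n.

    parents[t][j] is the index, in generation t - 1, of the parent of
    individual j of generation t (for t >= 1); parents[0] is unused.
    The loop unrolls I(i, n, s) = I(I(i, n, n-1), n-1, s), with I(i, n, n) = i.
    """
    while n > s:
        i = parents[n][i]
        n -= 1
    return i


def ancestral_line(populations, parents, i, n):
    """Sequence (X_s(I(i, n, s)), 0 <= s <= n) of the ancestors of individual i."""
    return [populations[s][ancestor_index(parents, i, n, s)] for s in range(n + 1)]


if __name__ == "__main__":
    # Toy genealogy with m = 3 individuals and two generations after time 0.
    parents = [None, [0, 0, 2], [1, 1, 0]]
    populations = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
    print(ancestral_line(populations, parents, 0, 2))
```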


Notation. For $b \in \{0,\dots,\ell\}$, we denote by $(b)^m$ the column vector whose components
are all equal to $b$.


Proposition 19.1 Let $b \in \{0,\dots,\ell\}$ and let $(X_n)_{n \ge 0}$ be the neutral Wright-Fisher
process starting from $(b)^m$. Let $i \in \{1,\dots,m\}$. For any $n \ge 0$, the law of the
ancestral line $(\mathrm{ancestor}(i, n, s),\ 0 \le s \le n)$ of the $i$-th individual of $X_n$ is equal to the
law of $(Y_0, \dots, Y_n)$ starting from $b$.


The proof is standard. One can proceed by induction as in [15]. In fact, the ancestral
lines of the individuals at time $n$ are given by a coalescent process. Along an ancestral
line, an individual moves according to the mutation dynamics given by the matrix
$M_H$.


19.2 Monotonicity and Correlations 


We consider here the product space $\{0,\dots,\ell\}^m$ equipped with the natural product
order:

$$d \le e \iff \forall i \in \{1,\dots,m\} \quad d(i) \le e(i).$$

We define the distance process $(D_n)_{n \ge 0}$ on $\{0,\dots,\ell\}^m$ by

$$\forall n \ge 0 \quad \forall i \in \{1,\dots,m\} \qquad D_n(i) = H(X_n(i)).$$

The Markov chain $(X_n)_{n \ge 0}$ is lumpable with respect to the Hamming classes, so that
the distance process $(D_n)_{n \ge 0}$ is a genuine Markov chain. It is furthermore possible
to couple the neutral distance process starting from two different populations $d$ and
$e$ which are comparable, say $d \le e$, in such a way that the process starting from $d$
is always smaller than the process starting from $e$. This construction is quite heavy
and it is explicitly carried out in [16]. As a consequence, the neutral distance process
$(D_n)_{n \ge 0}$ is monotone (see definition A.2 in the appendix), i.e., for any non-decreasing
function $f$ on $\{0,\dots,\ell\}^m$, the map

$$d \in \{0,\dots,\ell\}^m \mapsto E\big(f(D_n) \,\big|\, D_0 = d\big)$$

is itself non-decreasing.


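Definition 19.2 A probability measure $\mu$ on $\{0,\dots,\ell\}^m$ is said to have positive
correlations if for any functions $f, g : \{0,\dots,\ell\}^m \to \mathbb{R}$ which are non-decreasing,
we have

$$\sum_{d \in \{0,\dots,\ell\}^m} f(d) g(d) \mu(d) \ge \Big(\sum_{d \in \{0,\dots,\ell\}^m} f(d) \mu(d)\Big) \Big(\sum_{d \in \{0,\dots,\ell\}^m} g(d) \mu(d)\Big).$$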


The Harris inequality, or the FKG inequality in this context, says that any product
probability measure on $\{0,\dots,\ell\}^m$ has positive correlations. The FKG inequality
is in fact true for any product probability measure on a product of copies of the interval $[0, 1]$
(see section 2.2 of Grimmett’s book [43]). As far as correlations are concerned, there
is not much to do with the original Wright-Fisher model, because its state space is
not partially ordered. So we examine the distance process.


Proposition 19.3 Suppose that we are in the neutral case $\sigma = 1$. If the law of $D_0$ has
positive correlations, then for any $n \ge 0$, the law of $D_n$ has positive correlations.


Proof The Wright-Fisher model $(X_n)_{n \ge 0}$ can be seen as a probabilistic cellular
automaton. Indeed, given the population $X_n = x$ at time $n$, the individuals
$(X_{n+1}(i),\ 1 \le i \le m)$ of the population at time $n + 1$ are independent. This still
holds for the distance process. Moreover, the neutral distance process $(D_n)_{n \ge 0}$ is
monotone. Monotone probabilistic cellular automata preserve the FKG inequality.
This is explained in detail by Mezić [63] and it was first observed by Harris [44] at
the very end of his article on continuous time processes. Because the argument is
very short, we reproduce it here. Suppose that the initial law $\mu$ of $D_0$ has positive
correlations. Let $f, g : \{0,\dots,\ell\}^m \to \mathbb{R}$ be two non-decreasing functions. For
any $d \in \{0,\dots,\ell\}^m$, the conditional law of $D_1$ knowing that $D_0 = d$ is a product
measure on $\{0,\dots,\ell\}^m$, thus it satisfies the FKG inequality, whence

$$\forall d \in \{0,\dots,\ell\}^m \qquad E\big(f(D_1) g(D_1) \,\big|\, D_0 = d\big) \ge E\big(f(D_1) \,\big|\, D_0 = d\big)\, E\big(g(D_1) \,\big|\, D_0 = d\big).$$

We integrate the inequality with respect to the initial law $\mu$:

$$\sum_{d \in \{0,\dots,\ell\}^m} E\big(f(D_1) g(D_1) \,\big|\, D_0 = d\big)\, \mu(d) \ge \sum_{d \in \{0,\dots,\ell\}^m} E\big(f(D_1) \,\big|\, D_0 = d\big)\, E\big(g(D_1) \,\big|\, D_0 = d\big)\, \mu(d).$$

Since $(D_n)_{n \ge 0}$ is monotone, the maps

$$d \in \{0,\dots,\ell\}^m \mapsto E\big(f(D_1) \,\big|\, D_0 = d\big), \qquad d \in \{0,\dots,\ell\}^m \mapsto E\big(g(D_1) \,\big|\, D_0 = d\big),$$

are non-decreasing. By hypothesis, the initial law $\mu$ has positive correlations, therefore

$$\sum_{d \in \{0,\dots,\ell\}^m} E\big(f(D_1) \,\big|\, D_0 = d\big)\, E\big(g(D_1) \,\big|\, D_0 = d\big)\, \mu(d) \ge \Big(\sum_{d \in \{0,\dots,\ell\}^m} E\big(f(D_1) \,\big|\, D_0 = d\big)\, \mu(d)\Big) \Big(\sum_{d \in \{0,\dots,\ell\}^m} E\big(g(D_1) \,\big|\, D_0 = d\big)\, \mu(d)\Big).$$

The two above inequalities imply that the law of $D_1$ has positive correlations. We
conclude by iterating the argument. □
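The Harris (FKG) inequality for product measures, which drives the argument above, is easy to observe on a small state space. The sketch below enumerates $\{0,1,2\}^3$ under a product measure and computes the covariance of two coordinate-wise non-decreasing functions. The marginals and the two test functions are arbitrary illustrative choices.

```python
from itertools import product


def expectation(f, marginals):
    """E f(d) under the product of the given marginals; marginals[i][v] = P(d(i) = v)."""
    total = 0.0
    for d in product(*(range(len(mu)) for mu in marginals)):
        weight = 1.0
        for coord, mu in zip(d, marginals):
            weight *= mu[coord]
        total += f(d) * weight
    return total


def covariance(f, g, marginals):
    """Cov(f, g) under the product measure; non-negative when f, g are non-decreasing."""
    return expectation(lambda d: f(d) * g(d), marginals) - expectation(
        f, marginals
    ) * expectation(g, marginals)


if __name__ == "__main__":
    marginals = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.1, 0.1, 0.8]]
    # sum and max are both non-decreasing for the product order.
    print(covariance(sum, max, marginals))
```

Replacing one of the two functions by a non-increasing one (for instance `lambda d: -min(d)`) flips the sign of the covariance, which is a quick way to confirm that monotonicity is what matters.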


19.3 Time away from the Disorder 


This section is devoted to showing that the process $(D_n)_{n \ge 0}$ has a very small probability
of staying away from 0 and $\ell_\kappa = \ell(1 - 1/\kappa)$. We will show that it is very likely
to enter in a short time the set $\mathcal{N}^*$ defined by

$$\begin{aligned}
\mathcal{N}^* &= \big\{\, c \in \mathcal{S}^\ell : c(i) = 0,\ 0 \le i < \ell_\kappa - \sqrt{\ell \ln \ell} \,\big\} \\
&= \big\{\, d \in \{0,\dots,\ell\}^m : d \ge \big(\ell_\kappa - \sqrt{\ell \ln \ell}\big)^m \,\big\}.
\end{aligned}$$

As usual, we define the associated hitting time:

$$\tau_{\mathcal{N}^*} = \inf \{\, n \ge 1 : D_n \in \mathcal{N}^* \,\}.$$

We first prove a general lower bound on the probability of entering a set of the
form $\{b,\dots,\ell\}^m$. Notice that this bound is only valid for the neutral Wright-Fisher
process, when $\sigma = 1$. Indeed, it is based on the monotonicity and the correlation
inequalities, which do not hold when the fitness landscape is not selectively neutral.
We shall make use of this bound to estimate the entrance time $\tau_{\mathcal{N}^*}$ under the condition
that the hitting time of the master sequence $\tau_{\mathcal{M}}$ is larger than $\tau_{\mathcal{N}^*}$. This is legitimate
because, as said earlier, until time $\tau_{\mathcal{M}}$, the dynamics of the Wright-Fisher process
is selectively neutral.

Lemma 19.4 For any $b, c \in \{1,\dots,\ell\}$ and $d \in \{0,\dots,\ell\}^m$ such that $d \ge (c)^m$,
we have

$$\forall n \ge 1 \qquad P\big(D_n \ge (b)^m \,\big|\, D_0 = d\big) \ge \big(P(Y_n \ge b \mid Y_0 = c)\big)^m.$$


Proof Let $d \in \{0,\dots,\ell\}^m$ and $c \in \{1,\dots,\ell\}$ such that $d \ge (c)^m$. Since the
process $(D_n)_{n \ge 0}$ is monotone,

$$P\big(D_n \ge (b)^m \,\big|\, D_0 = d\big) \ge P\big(D_n \ge (b)^m \,\big|\, D_0 = (c)^m\big).$$

By proposition 19.3, the distance process starting from $(c)^m$ has positive correlations,
therefore

$$P\big(D_n \ge (b)^m \,\big|\, D_0 = (c)^m\big) \ge \prod_{1 \le i \le m} P\big(D_n(i) \ge b \,\big|\, D_0 = (c)^m\big).$$

By proposition 19.1, the ancestral line of any individual present at time $n$ has the
same law as $Y_0, \dots, Y_n$, therefore

$$\forall i \in \{1,\dots,m\} \qquad P\big(D_n(i) \ge b \,\big|\, D_0 = (c)^m\big) = P(Y_n \ge b \mid Y_0 = c).$$

Putting these inequalities together, we obtain the claim of the lemma. □


With the help of the previous lemma, we derive next a bound on the law of $\tau_{\mathcal{N}^*}$. The
meaning of “asymptotically” is recalled at the beginning of the chapter.

Lemma 19.5 Asymptotically, we have

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\Big(\tau_{\mathcal{N}^*} \le \frac{1}{a} \ell \ln \ell \,\Big|\, D_0 = d\Big) \ge \Big(1 - \frac{1}{\ell}\Big)^m.$$

Proof Let us set

$$N = \frac{1}{a} \ell \ln \ell.$$

By the monotonicity of the process $(D_n)_{n \ge 0}$, we have

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(D_N \in \mathcal{N}^* \,\big|\, D_0 = d\big) \ge P\big(D_N \in \mathcal{N}^* \,\big|\, D_0 = (0)^m\big).$$

We now apply lemma 19.4 with $b = \ell_\kappa - \sqrt{\ell \ln \ell}$, $c = 0$ and we get

$$P\big(D_N \in \mathcal{N}^* \,\big|\, D_0 = (0)^m\big) \ge \Big(P\big(Y_N \ge \ell_\kappa - \sqrt{\ell \ln \ell} \,\big|\, Y_0 = 0\big)\Big)^m.$$

Proposition 18.6 yields that

$$P\big(Y_N \ge \ell_\kappa - \sqrt{\ell \ln \ell} \,\big|\, Y_0 = 0\big) \ge 1 - \frac{1}{\ell}.$$

Putting these inequalities together, we obtain the claim of the lemma. □

Once we have a lower bound on the probability of entering the set $\mathcal{N}^*$ which is
uniform with respect to the starting point, we apply it on each interval to get a
geometric bound.


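Corollary 19.6 Asymptotically, we have

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(\tau_{\mathcal{N}^*} > n \,\big|\, D_0 = d\big) \le \Big(1 - \Big(1 - \frac{1}{\ell}\Big)^m\Big)^{\lfloor a n / (\ell \ln \ell) \rfloor}.$$

Proof Lemma 19.5 gives that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\Big(\tau_{\mathcal{N}^*} > \Big\lfloor \frac{1}{a} \ell \ln \ell \Big\rfloor \,\Big|\, D_0 = d\Big) \le 1 - \Big(1 - \frac{1}{\ell}\Big)^m.$$

Once we have this uniform bound, the result of the corollary follows from
lemma 16.5. □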


19.4 Reaching the Disorder 


The aim of this section is to compute a lower bound on the probability of reaching
the equilibrium in the neutral phase in a reasonable amount of time, when the initial
population does not contain any master sequence. More precisely, we will show that,
starting from any point in the neutral phase, it takes a time of order $\ell \ln \ell$ to reach
a neighborhood of the equilibrium. We are mainly interested in the situation when
the master sequence has just been destroyed, so that the population will typically
be concentrated in the Hamming classes near 0. At equilibrium, the population is
close to the Hamming classes in the neighborhood of $\ell_\kappa = \ell(1 - 1/\kappa)$. Thus we shall
compute a lower bound on the probability of reaching the set

$$\mathcal{N}^* = \big\{\, d \in \{0,\dots,\ell\}^m : d \ge \big(\ell_\kappa - \sqrt{\ell \ln \ell}\big)^m \,\big\}. \tag{19.1}$$

As usual, we define the associated hitting time:

$$\tau_{\mathcal{N}^*} = \inf \{\, n \ge 1 : D_n \in \mathcal{N}^* \,\}.$$

We recall that $\tau_{\mathcal{M}}$ is the time of exit from the neutral phase, or the hitting time of
the set $\mathcal{M}$:

$$\tau_{\mathcal{M}} = \inf \{\, n \ge 0 : D_n \in \mathcal{M} \,\}.$$

The meaning of “asymptotically” is recalled at the beginning of the chapter.


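Theorem 19.7 Asymptotically, we have

$$\forall d \in \mathcal{N} \qquad P\Big(\tau_{\mathcal{N}^*} \le \frac{2}{a} \ell \ln \ell,\ \tau_{\mathcal{N}^*} < \tau_{\mathcal{M}} \,\Big|\, D_0 = d\Big) \ge \Big(1 - \frac{1}{\ln \ell}\Big)^{3m}.$$

Proof The proof of this lower bound is quite delicate. We will use the FKG inequality
and the fact that the neutral distance process $(D_n)_{n \ge 0}$ has positive correlations. A
consequence of the monotonicity property is that

$$\forall d \in \mathcal{N} \quad \forall n \ge 0 \qquad P\big(\tau_{\mathcal{N}^*} \le n \,\big|\, D_0 = d\big) \ge P\big(\tau_{\mathcal{N}^*} \le n \,\big|\, D_0 = (1)^m\big).$$

Recall that $(1)^m$ is the column vector whose components are all equal to 1. Thus
we suppose that the distance process starts from $(1)^m$ and we will estimate the
probability of a specific scenario leading to the entrance in the set $\mathcal{N}^*$ before visiting
the set $\mathcal{M}$. Let us set

$$N_0 = \frac{3}{a} (\ln \ell)(\ln \ln \ell), \qquad N_1 = \frac{1}{a} \ell \ln \ell.$$

For $\ell$ large enough, the event

$$\mathcal{D} = \big\{\, \tau_{\mathcal{M}} > N_0 + N_1,\ D_{N_0 + N_1} \in \mathcal{N}^*,\ D_{N_0} \ge (\ln \ell)^m \,\big\}$$

is certainly included in the event $\{\, \tau_{\mathcal{N}^*} \le N_0 + N_1 < \tau_{\mathcal{M}} \,\}$. In order to estimate the
probability of this last event, we condition on the population at time $N_0$ and we write

$$\begin{aligned}
P\big(\mathcal{D} \,\big|\, D_0 = (1)^m\big)
&= \sum_{d \ge (\ln \ell)^m} P\big(\tau_{\mathcal{M}} > N_0 + N_1,\ D_{N_0 + N_1} \in \mathcal{N}^*,\ D_{N_0} = d \,\big|\, D_0 = (1)^m\big) \\
&= \sum_{d \ge (\ln \ell)^m} P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_0 + N_1 \\ D_{N_0 + N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, \begin{matrix} D_{N_0} = d \\ \tau_{\mathcal{M}} > N_0 \end{matrix}\Big)\, P\Big(\begin{matrix} D_{N_0} = d \\ \tau_{\mathcal{M}} > N_0 \end{matrix} \,\Big|\, D_0 = (1)^m\Big).
\end{aligned}$$

Using the Markov property, we have

$$P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_0 + N_1 \\ D_{N_0 + N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, \begin{matrix} D_{N_0} = d \\ \tau_{\mathcal{M}} > N_0 \end{matrix}\Big) = P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_1 \\ D_{N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, D_0 = d\Big).$$

From the existence of a monotone coupling for the distance process, we obtain that,
for any $d \ge (\ln \ell)^m$,

$$P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_1 \\ D_{N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, D_0 = d\Big) \ge P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_1 \\ D_{N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, D_0 = (\ln \ell)^m\Big).$$

Plugging the previous inequalities into the sum above, we obtain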


$$P\big(\mathcal{D} \,\big|\, D_0 = (1)^m\big) \ge P\Big(\begin{matrix} \tau_{\mathcal{M}} > N_1 \\ D_{N_1} \in \mathcal{N}^* \end{matrix} \,\Big|\, D_0 = (\ln \ell)^m\Big) \times P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m\big). \tag{19.2}$$

We first study the last term in the above inequality. A crucial simplifying feature of
the neutral case is that the selection and mutation steps can be decoupled, because
the genealogy $\mathcal{G}$ is independent of the genotypes of the individuals. We condition
on the genealogy of the process until time $N_0$ to write

$$P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m\big) = \sum_{g} P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big)\, P\big(\mathcal{G} = g \,\big|\, D_0 = (1)^m\big). \tag{19.3}$$

Once the genealogy $\mathcal{G}$ is fixed, the trajectory of the process $(D_n)_{n \ge 0}$ depends only
upon the mutations which occur along each lineage. The mutations can be realized
with the help of i.i.d. random variables with uniform distribution over $[0, 1]$, see
chapter 4 of [15] or section 5 of [16]. With this scheme, conditionally on $\mathcal{G}$, the
events $\{\, D_{N_0} \ge (\ln \ell)^m \,\}$, $\{\, \tau_{\mathcal{M}} > N_0 \,\}$ depend only on the mutation variables and
they are non-decreasing with respect to these variables. By the FKG inequality for a
product measure,

$$P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big) \ge P\big(D_{N_0} \ge (\ln \ell)^m \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big)\, P\big(\tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big).$$
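To estimate the last probability, note that in the initial population $(1)^m$ each of the
individuals has one wrong digit, i.e., a gene that is different from the corresponding
gene of the master sequence. We consider the event $\mathcal{E}$ defined by

$$\mathcal{E} = \Big\{ \begin{matrix} \text{until generation } N_0 \text{, none of the initially wrong digits} \\ \text{is transformed into a gene of the master sequence} \end{matrix} \Big\}.$$

If the event $\mathcal{E}$ occurs, then, until time $N_0$, the master sequence has not been discovered
and therefore $\tau_{\mathcal{M}} > N_0$, whence

$$P\big(\tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big) \ge P\big(\mathcal{E} \,\big|\, D_0 = (1)^m,\ \mathcal{G} = g\big) = P\big(\mathcal{E} \,\big|\, D_0 = (1)^m\big) = \Big(1 - \frac{q}{\kappa - 1}\Big)^{m N_0}.$$

We have used the fact that $\mathcal{E}$ is independent of the genealogy. Putting this into the
conditioning (19.3) and summing over $g$, we obtain

$$P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m\big) \ge P\big(D_{N_0} \ge (\ln \ell)^m \,\big|\, D_0 = (1)^m\big) \Big(1 - \frac{q}{\kappa - 1}\Big)^{m N_0}.$$

We apply lemma 19.4 with $b = \ln \ell$, $c = 1$ and we obtain

$$P\big(D_{N_0} \ge (\ln \ell)^m \,\big|\, D_0 = (1)^m\big) \ge \big(P(Y_{N_0} \ge \ln \ell \mid Y_0 = 1)\big)^m.$$

Using the estimate of proposition 18.5, we get from the previous inequalities that

$$P\big(D_{N_0} \ge (\ln \ell)^m,\ \tau_{\mathcal{M}} > N_0 \,\big|\, D_0 = (1)^m\big) \ge \Big(1 - \frac{1}{\ln \ell}\Big)^m \Big(1 - \frac{q}{\kappa - 1}\Big)^{m N_0}. \tag{19.4}$$

We study next the first term in the product of formula (19.2). We write

$$P\big(D_{N_1} \in \mathcal{N}^*,\ \tau_{\mathcal{M}} > N_1 \,\big|\, D_0 = (\ln \ell)^m\big) \ge P\big(D_{N_1} \in \mathcal{N}^* \,\big|\, D_0 = (\ln \ell)^m\big) - P\big(\tau_{\mathcal{M}} \le N_1 \,\big|\, D_0 = (\ln \ell)^m\big). \tag{19.5}$$

To control the last term, we shall rely on the following lemma.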


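Lemma 19.8 Let $\mathcal{B}$ be the binomial distribution $\mathrm{Bin}(\ell, 1 - 1/\kappa)$. For $b \in \{1,\dots,\ell\}$,
we have

$$\forall n \ge 0 \qquad P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = (b)^m\big) \le n m \frac{\mathcal{B}(0)}{\mathcal{B}(b)}.$$

Proof Let $n \ge 0$ and $b \in \{1,\dots,\ell\}$. We write

$$\begin{aligned}
P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = (b)^m\big)
&= P\big(\exists\, t \le n \quad \exists\, i \in \{1,\dots,m\} \quad D_t(i) = 0 \,\big|\, D_0 = (b)^m\big) \\
&\le \sum_{1 \le t \le n}\ \sum_{1 \le i \le m} P\big(D_t(i) = 0 \,\big|\, D_0 = (b)^m\big).
\end{aligned}$$

By proposition 19.1, for any $t \ge 0$, any $i \in \{1,\dots,m\}$,

$$P\big(D_t(i) = 0 \,\big|\, D_0 = (b)^m\big) = P(Y_t = 0 \mid Y_0 = b).$$

Using proposition 18.1 and lemma 16.2 together with the previous inequalities, we get

$$P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = (b)^m\big) \le n m \frac{\mathcal{B}(0)}{\mathcal{B}(b)}$$

as claimed. □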
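The single-lineage bound used in this proof, $P(Y_t = 0 \mid Y_0 = b) \le \mathcal{B}(0)/\mathcal{B}(b)$, follows from reversibility, since $\mathcal{B}(b) M_H^t(b, 0) = \mathcal{B}(0) M_H^t(0, b) \le \mathcal{B}(0)$. The sketch below checks it numerically by propagating the law of $Y_t$ step by step. It rebuilds the one-step transition from the definition of $M_H$, and the parameter values are arbitrary.

```python
from math import comb


def pmf(n, p, k):
    """P(Bin(n, p) = k) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def step(dist, ell, kappa, q):
    """Push a distribution on {0, ..., ell} through one step of the lumped chain."""
    new = [0.0] * (ell + 1)
    for b, mass in enumerate(dist):
        if mass == 0.0:
            continue
        for x in range(b + 1):  # repaired digits, X ~ Bin(b, q/(kappa-1))
            px = pmf(b, q / (kappa - 1), x)
            for y in range(ell - b + 1):  # damaged digits, Y ~ Bin(ell-b, q)
                new[b - x + y] += mass * px * pmf(ell - b, q, y)
    return new


def probabilities_of_zero(ell, kappa, q, b, horizon):
    """List of P(Y_t = 0 | Y_0 = b) for t = 1, ..., horizon."""
    dist = [0.0] * (ell + 1)
    dist[b] = 1.0
    out = []
    for _ in range(horizon):
        dist = step(dist, ell, kappa, q)
        out.append(dist[0])
    return out


if __name__ == "__main__":
    ell, kappa, q, b = 8, 4, 0.1, 3
    ratio = pmf(ell, 1 - 1 / kappa, 0) / pmf(ell, 1 - 1 / kappa, b)
    print(max(probabilities_of_zero(ell, kappa, q, b, 50)), ratio)
```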


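Using the inequality of lemma 19.8 with $n = N_1$ and $b = \ln \ell$, and lemma 18.2, we
get

$$P\big(\tau_{\mathcal{M}} \le N_1 \,\big|\, D_0 = (\ln \ell)^m\big) \le N_1 m \frac{\mathcal{B}(0)}{\mathcal{B}(\ln \ell)} \le N_1 m \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell}. \tag{19.6}$$

For the other term in formula (19.5), we apply proposition 18.6 and lemma 19.4 as in the proof of
lemma 19.5 and we get

$$P\big(D_{N_1} \in \mathcal{N}^* \,\big|\, D_0 = (\ln \ell)^m\big) \ge \Big(1 - \frac{1}{\ell}\Big)^m. \tag{19.7}$$

Plugging inequalities (19.6) and (19.7) into inequality (19.5), we obtain

$$P\big(D_{N_1} \in \mathcal{N}^*,\ \tau_{\mathcal{M}} > N_1 \,\big|\, D_0 = (\ln \ell)^m\big) \ge \Big(1 - \frac{1}{\ell}\Big)^m - N_1 m \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell}. \tag{19.8}$$

Substituting the inequalities (19.4) and (19.8) into inequality (19.2), we obtain

$$P\big(\mathcal{D} \,\big|\, D_0 = (1)^m\big) \ge \Big(1 - \frac{1}{\ln \ell}\Big)^m \Big(1 - \frac{q}{\kappa - 1}\Big)^{m N_0} \Big(\Big(1 - \frac{1}{\ell}\Big)^m - N_1 m \Big(\frac{2 \ln \ell}{\ell}\Big)^{\ln \ell}\Big).$$

Asymptotically, this last quantity is larger than $(1 - 1/\ln \ell)^{3m}$. Moreover, for $\ell$ large
enough, we have

$$N_0 + N_1 = \frac{3}{a} (\ln \ell)(\ln \ln \ell) + \frac{1}{a} \ell \ln \ell \le \frac{2}{a} \ell \ln \ell.$$

These last inequalities yield the result of theorem 19.7. □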


19.5 Escape from the Disorder 


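The goal of this section is to estimate the discovery time $\tau_{\mathcal{M}}$, defined by

$$\tau_{\mathcal{M}} = \inf \{\, n \ge 0 : D_n \in \mathcal{M} \,\}.$$

The meaning of “asymptotically” is recalled at the beginning of the chapter.


Proposition 19.9 Asymptotically, we have

$$\forall d \in \mathcal{N} \qquad E\big(\tau_{\mathcal{M}} \,\big|\, D_0 = d\big) \le \frac{2}{a} (\ell \ln \ell)\, \kappa^\ell.$$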
Proof A consequence of the monotonicity property is that, for any $d \in \mathcal{N}$, we have

$$E\big(\tau_{\mathcal{M}} \,\big|\, D_0 = d\big) \le E\big(\tau_{\mathcal{M}} \,\big|\, D_0 = (\ell)^m\big). \tag{19.9}$$

Recall that $(\ell)^m$ is the column vector whose components are all equal to $\ell$. To
bound the discovery time $\tau_{\mathcal{M}}$ from above, we consider the time needed for a single
individual to discover the master sequence $w^*$, and we remark that, if the master
sequence has not been discovered until time $n$ in the distance process, then certainly
the ancestral line of any individual present at time $n$ does not contain the master
sequence. By proposition 19.1, the ancestral line of any individual present at time $n$
has the same law as $Y_0, \dots, Y_n$. Therefore, we conclude that

$$\forall n \ge 0 \qquad P\big(\tau_{\mathcal{M}} > n \,\big|\, D_0 = (\ell)^m\big) \le P(\tau_0 > n \mid Y_0 = \ell),$$

where $\tau_0$ is the hitting time of 0 for the process $(Y_n)_{n \ge 0}$. Summing this inequality
over $n \ge 0$, we obtain the following upper bound:

$$\forall d \in \mathcal{N} \quad \forall m \ge 1 \qquad E\big(\tau_{\mathcal{M}} \,\big|\, D_0 = d\big) \le E(\tau_0 \mid Y_0 = \ell). \tag{19.10}$$

We combine this inequality with proposition 18.7 and we obtain the desired upper
bound. □


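Corollary 19.10 Asymptotically, we have

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(\tau_{\mathcal{M}} > \ell (\ln \ell)^2 \kappa^\ell \,\big|\, D_0 = d\big) \le \frac{2}{a \ln \ell}. \tag{19.11}$$

Proof We apply the Markov inequality and we use the upper bound given in proposition 19.9. □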
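For the reader's convenience, the Markov inequality step can be written out. This is a sketch which assumes that the threshold is $\ell(\ln\ell)^2\kappa^\ell$, as it appears in formula (20.4), and that the starting point $d$ belongs to $\mathcal{N}$ (otherwise $\tau_{\mathcal{M}} = 0$ and there is nothing to prove):

```latex
P\big(\tau_{\mathcal{M}} > \ell(\ln\ell)^2\kappa^\ell \,\big|\, D_0 = d\big)
\;\le\; \frac{E(\tau_{\mathcal{M}} \mid D_0 = d)}{\ell(\ln\ell)^2\kappa^\ell}
\;\le\; \frac{\frac{2}{a}(\ell\ln\ell)\,\kappa^\ell}{\ell(\ln\ell)^2\kappa^\ell}
\;=\; \frac{2}{a\ln\ell}.
```

In particular the right-hand side is at most $1/2$ as soon as $\ln \ell \ge 4/a$.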


We close this section with a final estimate of the probability of leaving $\mathcal{N}^*$ and
entering $\mathcal{M}$ before returning to $\mathcal{N}^*$. Recall that the set $\mathcal{N}^*$ is defined in formula (19.1).


Corollary 19.11 We suppose here that $\alpha$ is finite. For any $\varepsilon > 0$, we have, asymptotically,

$$\forall d \in \mathcal{N}^* \qquad P\big(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big) \le \frac{m \ell^6}{(\kappa - \varepsilon)^\ell}. \tag{19.12}$$

The case when $\alpha$ is infinite should be treated differently. In fact, when $m$ grows
extremely fast compared to $\ell$, the above inequality will not hold asymptotically,
because the following scenario will have a larger probability: first one individual
jumps to the left of $\ell_\kappa - \sqrt{\ell \ln \ell}$ and before this individual has a chance to move back,
another individual of the population discovers the master sequence.


Proof Let $n \ge 1$. We write, for $d \in \mathcal{N}^*$,

$$P\big(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big) \le P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = d\big) + P\big(\tau_{\mathcal{N}^*} > n \,\big|\, D_0 = d\big). \tag{19.13}$$


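A consequence of the monotonicity property is that, for any $d \in \mathcal{N}^*$, we have

$$P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = d\big) \le P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = (\ell_\kappa - \sqrt{\ell \ln \ell})^m\big), \tag{19.14}$$

$$P\big(\tau_{\mathcal{N}^*} > n \,\big|\, D_0 = d\big) \le P\big(\tau_{\mathcal{N}^*} > n \,\big|\, D_0 = (0)^m\big). \tag{19.15}$$

Let $k \ge 1$ and suppose that $\tau_{\mathcal{M}} = k$. This means that some individual has discovered
the master sequence $w^*$ at time $k$. Obviously, the ancestral line of this individual
does not contain any other master sequence. By proposition 19.1, this ancestral line
has the same law as $Y_0, \dots, Y_k$. Therefore, we conclude that

$$P\big(\tau_{\mathcal{M}} = k \,\big|\, D_0 = (\ell_\kappa - \sqrt{\ell \ln \ell})^m\big) \le m\, P\big(\tau_0 = k \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big), \tag{19.16}$$

where $\tau_0$ is the hitting time of 0 for the process $(Y_n)_{n \ge 0}$. Let $\delta > 0$ and let us introduce
the hitting time

$$\tau_{\kappa,\delta} = \inf \{\, n \ge 1 : Y_n \ge \ell_\kappa (1 - \delta) \,\},$$

and the time $\theta$ of the last visit to the set $[\ell_\kappa (1 - \delta), \ell]$ before time $\tau_0$:

$$\theta = \max \{\, n < \tau_0 : Y_n \ge \ell_\kappa (1 - \delta) \,\}.$$

Note that $\theta$ depends on the future of the process, hence it is not a stopping time. We
condition the event $\{\, \tau_0 = k \,\}$ according to the value of $\theta$ and of $Y_\theta$ and we apply the
Markov property to get

$$\begin{aligned}
P\big(\tau_0 = k \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big)
&= \sum_{0 \le h < k}\ \sum_{y \ge \ell_\kappa (1 - \delta)} P\big(\tau_0 = k,\ \theta = h,\ Y_h = y \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big) \\
&= \sum_{0 \le h < k}\ \sum_{y \ge \ell_\kappa (1 - \delta)} P\big(\tau_0 = k,\ Y_t < \ell_\kappa (1 - \delta) \text{ for } h < t \le k \,\big|\, Y_h = y\big) \\
&\qquad\qquad \times P\big(\tau_0 > h,\ Y_h = y \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big) \\
&= \sum_{0 \le h < k}\ \sum_{y \ge \ell_\kappa (1 - \delta)} P\big(\tau_0 = k - h < \tau_{\kappa,\delta} \,\big|\, Y_0 = y\big)\, P\big(\tau_0 > h,\ Y_h = y \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big) \\
&\le k \sup_{y \ge \ell_\kappa (1 - \delta)} P\big(\tau_0 < \tau_{\kappa,\delta} \,\big|\, Y_0 = y\big).
\end{aligned}$$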


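Let $\varepsilon > 0$. We choose $\delta$ associated to $\varepsilon$ as in proposition 18.8 and we obtain that

$$P\big(\tau_0 = k \,\big|\, Y_0 = \ell_\kappa - \sqrt{\ell \ln \ell}\big) \le \frac{k}{(\kappa - \varepsilon)^\ell}.$$

Plugging this inequality into inequality (19.16), summing over $k$, and using inequality (19.14),
we get

$$P\big(\tau_{\mathcal{M}} \le n \,\big|\, D_0 = d\big) \le \frac{m n^2}{(\kappa - \varepsilon)^\ell}.$$

The probability in formula (19.15) is controlled by corollary 19.6. Putting together
the previous estimates, we conclude that, asymptotically,

$$P\big(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big) \le \frac{m n^2}{(\kappa - \varepsilon)^\ell} + \Big(1 - \Big(1 - \frac{1}{\ell}\Big)^m\Big)^{\lfloor a n / (\ell \ln \ell) \rfloor}.$$

We supposed that $\alpha$ is finite, thus we can choose $n = \ell^3$ to obtain the desired
conclusion. □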


Chapter 20
Synthesis


In this short chapter, we gather the various asymptotic estimates developed in chap- 
ters 17 and 19 in the neutral and non-neutral phases in order to complete the proof 
of theorem 11.2. 


20.1 The Quasispecies Regime 


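Let $\varepsilon > 0$ be fixed and let us define

$$\mathcal{N}_\varepsilon = \{\, c \in \mathcal{S}^\ell : c(0) \le \varepsilon \,\}, \qquad \mathcal{M}_\varepsilon = \{\, c \in \mathcal{S}^\ell : c(0) \ge Q(0) - \varepsilon \,\}.$$

We apply lemma 16.4 to the concentration process $(C_n)_{n \in \mathbb{N}}$, its invariant measure $\nu$
and the sets $G = \mathcal{N}_\varepsilon$, $V = \mathcal{M}_\varepsilon$ and we get the following inequality:

$$\nu(\mathcal{N}_\varepsilon) \le \sup_{d \in \mathcal{M}_\varepsilon} P\big(\tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \,\big|\, D_0 = d\big) \times \sup_{d \in \mathcal{N}_\varepsilon} E\big(\tau_{\mathcal{M}_\varepsilon} \,\big|\, D_0 = d\big). \tag{20.1}$$

Our goal is to prove that, in some suitable asymptotic regime, the probability $\nu(\mathcal{N}_\varepsilon)$
goes to 0. We will rely on the estimates developed in chapters 17 and 19 to control
each term of the right-hand side. In fact, the event $\{\, \tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \,\}$ corresponds to
the escape from the quasispecies, which has been estimated in section 17.5, where
we computed bounds on the process $Z_n = C_n(0)$. In particular, we have, for any
$d \in \mathcal{M}_\varepsilon$, setting $z = d(0)$,

$$P\big(\tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \,\big|\, D_0 = d\big) \le P\big(\tau_\varepsilon < \tau_{Q(0) - \varepsilon} \,\big|\, Z_0 = z\big), \tag{20.2}$$

where $\tau_\varepsilon$ and $\tau_{Q(0) - \varepsilon}$ are the hitting times of $[0, \varepsilon]$ and $[Q(0) - \varepsilon, 1]$ respectively.
From lemma 17.14, we have, asymptotically,

$$P\big(\tau_{\mathcal{N}_\varepsilon} < \tau_{\mathcal{M}_\varepsilon} \,\big|\, D_0 = d\big) \le \exp\big(-m \psi(a) + m \varepsilon\big). \tag{20.3}$$

To bound from above the expectation appearing in (20.1), we shall compute an upper
bound on the time needed to enter the set $\mathcal{M}_\varepsilon$, which is uniform with respect to the
starting point. We decompose the entrance into $\mathcal{M}_\varepsilon$ in two steps: a first step where
the process enters the non-neutral phase $\mathcal{M}$, and a second step where it reaches the
quasispecies equilibrium. To this end, we introduce the random time

$$\tau_{\mathcal{M}} = \inf \{\, n \ge 0 : C_n(0) > 0 \,\}.$$

Corollary 19.10 yields that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(\tau_{\mathcal{M}} > \ell (\ln \ell)^2 \kappa^\ell \,\big|\, D_0 = d\big) \le \frac{1}{2}. \tag{20.4}$$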

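Once a master sequence has been created, a quasispecies is created with positive
probability within a time of order $\ln m$. Indeed, from theorem 17.8, there exists a
$C > 0$ such that, asymptotically,

$$P\big(\tau_{\mathcal{M}_\varepsilon} \le C \ln m \,\big|\, C_0(0) > 0\big) \ge \frac{1}{(m + 1)^{C \ln m}}. \tag{20.5}$$

We use lemma 16.6 with $\tau_A = \tau_{\mathcal{M}_\varepsilon}$, $\tau_B = \tau_{\mathcal{M}}$, and together with inequalities (20.4)
and (20.5), we get that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(\tau_{\mathcal{M}_\varepsilon} \le \ell (\ln \ell)^2 \kappa^\ell + C \ln m \,\big|\, D_0 = d\big) \ge \frac{1}{2} \cdot \frac{1}{(m + 1)^{C \ln m}}.$$

Applying lemma 16.5, we deduce from the previous inequality that

$$\forall d \in \{0,\dots,\ell\}^m \qquad E\big(\tau_{\mathcal{M}_\varepsilon} \,\big|\, D_0 = d\big) \le 2 (m + 1)^{C \ln m} \big(\ell (\ln \ell)^2 \kappa^\ell + C \ln m\big). \tag{20.6}$$

Inequalities (20.1), (20.3), (20.6) yield

$$\nu(\mathcal{N}_\varepsilon) \le \exp\big(-m \psi(a) + m \varepsilon\big)\, 2 (m + 1)^{C \ln m} \big(\ell (\ln \ell)^2 \kappa^\ell + C \ln m\big). \tag{20.7}$$

Suppose now that we are in the second case of theorem 11.2, that is,

$$\alpha \psi(a) > \ln \kappa.$$

We can then find $\varepsilon > 0$ small enough so that $-\alpha \psi(a) + \varepsilon + \ln \kappa < 0$, and for this
choice of $\varepsilon$, the right-hand side of formula (20.7) goes to 0.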
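The last step can be made explicit by taking logarithms in the right-hand side of (20.7). The following is only a sketch, under the assumption that $m/\ell \to \alpha$ with $0 < \alpha < +\infty$ (so that $\ln m = O(\ln \ell)$); the error term bookkeeping is ours and is not taken from the text.

```latex
\frac{1}{\ell}\,\ln\Big( e^{-m\psi(a) + m\varepsilon}\; 2(m+1)^{C\ln m}
      \big(\ell(\ln\ell)^2\kappa^\ell + C\ln m\big) \Big)
\;=\; -\frac{m}{\ell}\big(\psi(a) - \varepsilon\big) + \ln\kappa
      + O\Big(\frac{(\ln\ell)^2}{\ell}\Big)
\;\longrightarrow\; -\alpha\big(\psi(a) - \varepsilon\big) + \ln\kappa .
```

When $\alpha\psi(a) > \ln\kappa$, this limit is negative for every sufficiently small $\varepsilon > 0$, so the bound decays exponentially fast in $\ell$. The factor $(m+1)^{C \ln m}$ only contributes $\exp(O((\ln \ell)^2))$ and does not affect the exponential rate.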


20.2 The Disordered Regime 


We define

$$\mathcal{N}^* = \big\{\, c \in \mathcal{S}^\ell : c(i) = 0,\ 0 \le i < \ell_\kappa - \sqrt{\ell \ln \ell} \,\big\}, \qquad \mathcal{M} = \{\, c \in \mathcal{S}^\ell : c(0) > 0 \,\}.$$

We apply lemma 16.4 to the concentration process $(C_n)_{n \in \mathbb{N}}$, its invariant measure $\nu$
and the sets $G = \mathcal{M}$, $V = \mathcal{N}^*$:

$$\nu(\mathcal{M}) \le \sup_{d \in \mathcal{N}^*} P\big(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big) \times \sup_{d \in \mathcal{M}} E\big(\tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big). \tag{20.8}$$

Our goal is to prove that, in some suitable asymptotic regime, the probability $\nu(\mathcal{M})$
goes to 0. We will rely on the estimates developed in chapters 17 and 19 to control
each term of the right-hand side. In fact, the event $\{\, \tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\}$ corresponds to the
escape from the equilibrium in the disordered phase, which has been estimated in
section 19.5. Let $\varepsilon > 0$. From corollary 19.11, we have that, asymptotically,

$$P\big(\tau_{\mathcal{M}} < \tau_{\mathcal{N}^*} \,\big|\, D_0 = d\big) \le \frac{m \ell^6}{(\kappa - \varepsilon)^\ell}. \tag{20.9}$$

To bound from above the expectation in (20.8), we shall compute an upper bound
on the time needed to enter the set $\mathcal{N}^*$, which is uniform with respect to the starting
point. We decompose the entrance into $\mathcal{N}^*$ in two steps: a first step where the process
enters the neutral phase $\mathcal{N}$, and a second step where it reaches the equilibrium inside
the neutral phase. To this end, we introduce the random time

$$\tau^0 = \inf \{\, n \ge 0 : C_n(0) = 0 \,\}.$$

From lemma 17.15, there exists a $C(\varepsilon)$ such that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\big(\tau^0 \le C(\varepsilon) \ln m \,\big|\, D_0 = d\big) \ge \exp\big(-m \psi(a) - m \varepsilon\big). \tag{20.10}$$


Once the master sequences have been destroyed, the population reaches its equilibrium
in the neutral phase within a time of order $\ell \ln \ell$. Indeed, from theorem 19.7,
we have that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \setminus M \qquad P\Big(\tau_{N^*} \le \frac{2}{a}\,\ell\ln\ell \,\Big|\, D_0 = d\Big) \ge \Big(1 - \frac{1}{\ln\ell}\Big)^{3m}. \tag{20.11}$$


We use lemma 16.6 with $\tau_A = \tau_{N^*}$, $\tau_B = \tau^0$, together with inequalities (20.10)
and (20.11). We obtain that, asymptotically,

$$\forall d \in \{0,\dots,\ell\}^m \qquad P\Big(\tau_{N^*} \le C(\varepsilon)\ln m + \frac{2}{a}\,\ell\ln\ell \,\Big|\, D_0 = d\Big) \ge \Big(1 - \frac{1}{\ln\ell}\Big)^{3m} \exp(-m\psi(a) - m\varepsilon) \ge \exp(-m\psi(a) - 2m\varepsilon).$$

Applying lemma 16.5, we deduce from the previous inequality that, asymptotically,
for all $d \in \{0,\dots,\ell\}^m$,

$$E(\tau_{N^*} \mid D_0 = d) \le \Big(C(\varepsilon)\ln m + \frac{2}{a}\,\ell\ln\ell\Big) \exp(m\psi(a) + 2m\varepsilon). \tag{20.12}$$


Inequalities (20.8), (20.9), (20.12) yield 


y(M) < woo (C#) Inm + ein e) exp (my(a) + 2me) . (20.13) 


Suppose now that we are in the first case of theorem 11.2, that is,
$$\alpha\psi(a) < \ln\kappa.$$


We can then find $\varepsilon > 0$ small enough so that $\alpha\psi(a) + 2\varepsilon - \ln(\kappa - \varepsilon) < 0$, and for this
choice of $\varepsilon$, the right-hand side of formula (20.13) goes to 0.


Part V 
Class-Dependent Fitness Landscapes 


Overview of Part V 


Parts II to IV dealt with the sharp peak landscape. Nevertheless, the type of results 
that we have presented for the sharp peak landscape hold for a wider class of fitness 
functions, namely the class-dependent ones. These functions are characterized by 
the fact that all sequences at the same Hamming distance from the master sequence 
share the same fitness. We present next the counterparts of the results of part II 
for class-dependent fitness functions. Chapters 21 and 22 deal with the equilibrium 
equation and its solutions. Chapter 23 gives probabilistic representations of these 
solutions based on the walk of a wandering mutant. Chapter 24 provides probabilistic 
interpretations of the generalized quasispecies distribution in terms of the Poisson 
random walk and the branching Poisson walk. Chapter 25 links the infinite population 
models of part I with the solutions of the equilibrium equation. 


Chapter 21
Generalized Quasispecies Distributions


In chapter 7, we obtained explicit formulas for the distribution of the quasispecies 
on the sharp peak landscape. To get these formulas, two ingredients played a key 
role: the Hamming classes and the asymptotic regime. Yet, the strategy employed 
for the sharp peak landscape still makes sense for a wider class of fitness functions, 
namely, the fitness functions that only depend on the Hamming distance to the master 
sequence. In this chapter we consider the analog of system (7.1) for class-dependent 
landscapes, and we derive a recurrence relation from it. We solve the recurrence 
relation and we explore some of its combinatorial properties. 


21.1 Class-Dependent Fitness Landscapes 


Class-dependent fitness landscapes are fitness functions that only depend on the 
Hamming distance to the master sequence. This is a natural class of fitness functions, 
which is often considered in mathematical genetics. For instance, when the fitness 
function has a single maximum and it decreases fast with the distance, it is poetically 
called a mount Fujiyama type landscape. In this and the two following sections, we 
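consider the analog of system (7.1) for a general function $A_H : \mathbb{N} \to \mathbb{R}^+$:

$$y(k) \sum_{h \ge 0} y(h) A_H(h) = \sum_{0 \le h \le k} y(h) A_H(h)\, e^{-a} \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 0. \tag{21.1}$$

A solution $y$ is a probability distribution if it satisfies the additional constraint

$$\forall k \ge 0 \quad y(k) \ge 0, \qquad \sum_{k \ge 0} y(k) = 1. \tag{21.2}$$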


The fitness function $A_H$ will be explicitly present in many of the upcoming formulas,
so in order to improve the readability of these expressions, we will omit the index
$H$ from the fitness function, and denote it simply by $A$. We are only interested in the
solutions of (21.1) that satisfy the constraint (21.2) and such that $y(0) > 0$. For if $y$
is a solution with $y(0) = 0$, we can ignore the equation for $k = 0$, and the remaining
system of equations falls into the form of (21.1) again. Thus, let us suppose that
$y(0) > 0$. We look first at the equation for $k = 0$:

$$y(0) \sum_{h \ge 0} y(h) A(h) = y(0) A(0) e^{-a}.$$

Since we are assuming that $y(0)$ is positive, the mean fitness, which is given by
$\sum_{h \ge 0} y(h) A(h)$, must be equal to $A(0) e^{-a}$. We make the change of variables
$z(k) = y(k)/y(0)$, we replace the mean fitness by $A(0) e^{-a}$ in (21.1), and we divide both
sides by $e^{-a}$, thus obtaining the recurrence relation

$$z(k) A(0) = \sum_{0 \le h \le k} z(h) A(h) \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 1, \tag{21.3}$$


with initial condition z(0) = 1. In order to get positive solutions, we make the 
following hypothesis. 


Hypothesis 21.1 We suppose that the fitness of the Hamming class 0 is greater than
the fitness of the other classes, i.e., $A(0) > A(k)$ for all $k \ge 1$.


This hypothesis is coherent with the Hamming class 0 corresponding to the master 
sequence, which is the fittest genotype. The method of generating functions cannot 
be implemented as easily as on the sharp peak landscape. Now, with the help of 
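hypothesis 21.1, the system (21.3) can be rewritten as

$$z(k) = \frac{1}{A(0) - A(k)} \sum_{h=0}^{k-1} z(h) A(h) \frac{a^{k-h}}{(k-h)!}, \qquad k \ge 1.$$

From this new system, it can be first guessed and then shown by induction that, for
all $k \ge 1$,

$$z(k) = \frac{a^k}{k!} \frac{A(0)}{A(k)} \sum_{1 \le h \le k} \; \sum_{0 = i_0 < \cdots < i_h = k} \frac{k!}{(i_1 - i_0)! \cdots (i_h - i_{h-1})!} \prod_{t=1}^{h} \frac{A(i_t)}{A(0) - A(i_t)}. \tag{21.4}$$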


The probabilistic eye will perceive the key role of the Poisson distribution in this 
formula. We will discuss this point further in a subsequent chapter. 
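
As a sanity check on the recurrence (21.3) and the closed form (21.4), the following Python sketch solves the recurrence directly and compares the result with the sum over chains $0 = i_0 < \cdots < i_h = k$. The function names and the sample fitness function are hypothetical and serve only as an illustration; exact rational arithmetic is used so that the comparison is an equality rather than an approximation.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def z_recurrence(A, a, kmax):
    """Solve (21.3) with z(0) = 1, using the rewritten form
    z(k) = (A(0) - A(k))^-1 * sum_{h<k} z(h) A(h) a^(k-h) / (k-h)!."""
    z = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum(z[h] * A(h) * a ** (k - h) / factorial(k - h) for h in range(k))
        z.append(s / (A(0) - A(k)))
    return z


def z_explicit(A, a, k):
    """Evaluate the closed form (21.4) by enumerating all chains
    0 = i_0 < i_1 < ... < i_h = k."""
    total = Fraction(0)
    for h in range(1, k + 1):
        # pick the h-1 intermediate indices strictly between 0 and k
        for mid in combinations(range(1, k), h - 1):
            idx = (0,) + mid + (k,)
            multinomial = Fraction(factorial(k))
            product = Fraction(1)
            for t in range(1, h + 1):
                multinomial /= factorial(idx[t] - idx[t - 1])
                product *= A(idx[t]) / (A(0) - A(idx[t]))
            total += multinomial * product
    return a ** k / factorial(k) * A(0) / A(k) * total


def sample_fitness(k):
    """Hypothetical class-dependent fitness satisfying hypothesis 21.1."""
    return Fraction(4) if k == 0 else Fraction(1) + Fraction(1, k + 1)
```

Calling `z_recurrence(sample_fitness, Fraction(1, 2), 6)` and `z_explicit(sample_fitness, Fraction(1, 2), k)` for `k` from 1 to 6 should give identical fractions; the enumeration in `z_explicit` grows like $2^{k-1}$, so it is only meant for small $k$.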


21.2 Up-Down Coefficients 


If we apply the previous formula (21.4) to the sharp peak landscape, we recover
the formula (9.5) for the quasispecies involving the Stirling numbers. Indeed, in this
case, the last product depends only on $h$ (it is equal to $(\sigma - 1)^{-h}$) and the sum of
the multinomial coefficients is precisely equal to $h! \left\{ {k \atop h} \right\}$. There is yet another formula
for the quantities $y(k)$, which is the analog of the formula involving the Eulerian
numbers in the case of the sharp peak landscape. In order to present this formula,
we introduce the up-down numbers or up-down coefficients. Let $n \ge 2$, and let

$$\sigma = (\sigma(1), \dots, \sigma(n))$$

be a permutation of $1, \dots, n$. The ascents and descents of $\sigma$ are codified by the Niven
signature of $\sigma$, that is, an array $(q_1, \dots, q_{n-1}) \in \{-1, +1\}^{n-1}$ such that the product
$q_i(\sigma(i+1) - \sigma(i))$ is positive for all $i$. In other words, the coefficient $q_i$ is equal to
$+1$ if $\sigma(i+1) > \sigma(i)$, that is, if $\sigma$ has an ascent at $i$, and to $-1$ if $\sigma$ has a descent at $i$.
For instance, the Niven signature of the permutation $(31542)$ is $(-1, +1, -1, -1)$. The
up-down numbers, which we define next, count the number of permutations sharing
the same pattern of ascents and descents.


Definition 21.2 Let $n \ge 2$ and let $I$ be a subset of $\{1, \dots, n-1\}$. The up-down
coefficient $\left\{ {n \atop I} \right\}$ is defined as the number of permutations of $1, \dots, n$ having ascents
in the positions $I$ and descents elsewhere. In other words, it is the number of
permutations of $1, \dots, n$ having for Niven's signature

$$\forall i \in \{1, \dots, n-1\} \qquad q_i = \begin{cases} +1 & \text{if } i \in I, \\ -1 & \text{if } i \notin I. \end{cases}$$
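
The definition can be made concrete with a brute-force count. The sketch below computes the Niven signature of a permutation and the up-down coefficient by enumerating all permutations; the function names are hypothetical, and the enumeration is only practical for small $n$.

```python
from itertools import permutations


def niven_signature(perm):
    """Return (q_1, ..., q_{n-1}) with q_i = +1 at an ascent and -1 at a descent."""
    return tuple(1 if perm[i + 1] > perm[i] else -1 for i in range(len(perm) - 1))


def up_down(n, ascents):
    """Count the permutations of 1..n whose ascents are exactly at the
    positions in `ascents` (positions are numbered from 1 to n-1)."""
    target = tuple(1 if i in ascents else -1 for i in range(1, n))
    return sum(
        1 for p in permutations(range(1, n + 1)) if niven_signature(p) == target
    )
```

For example, `niven_signature((3, 1, 5, 4, 2))` reproduces the signature of the permutation $(31542)$ given in the text, and summing `up_down(n, I)` over all subsets `I` of $\{1, \dots, n-1\}$ gives $n!$, since every permutation has exactly one signature.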


It turns out that the quantities $z(k)$ can be expressed with the help of the up-down
coefficients. For all $k \ge 1$, we have

$$z(k) = \frac{a^k}{k!} \sum_{I \subset \{1, \dots, k-1\}} \left\{ {k \atop I} \right\} \prod_{1 \le j \le k} \frac{A(0)}{A(0) - A(j)} \prod_{i \in I} \frac{A(i)}{A(0)}.$$

In the case of the sharp peak landscape, the last product depends only on the
cardinality of $I$, it is equal to $\sigma^{-|I|}$; if we sum all the terms corresponding to subsets
$I$ of cardinality $h$, we obtain precisely the number of permutations of $1, \dots, k$
having $h$ ascents, which is equal to the Eulerian number $\left\langle {k \atop h} \right\rangle$. This way we recover
formula (9.12).
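
The up-down formula can be checked numerically against the recurrence (21.3). The following sketch, with hypothetical function names and a hypothetical sample fitness function, tallies the ascent sets of all permutations of $1, \dots, k$ (each tally is an up-down coefficient) and evaluates the sum over $I$ in exact arithmetic. It also counts permutations by number of ascents, which gives the Eulerian numbers used in the sharp peak case.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def z_recurrence(A, a, kmax):
    """Solve (21.3) with z(0) = 1."""
    z = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum(z[h] * A(h) * a ** (k - h) / factorial(k - h) for h in range(k))
        z.append(s / (A(0) - A(k)))
    return z


def ascent_set(perm):
    """Positions i (from 1) such that perm has an ascent at i."""
    return frozenset(i + 1 for i in range(len(perm) - 1) if perm[i + 1] > perm[i])


def z_updown(A, a, k):
    """Evaluate the up-down formula for z(k) by brute force."""
    # tally[I] is the up-down coefficient of k and I
    tally = Counter(ascent_set(p) for p in permutations(range(1, k + 1)))
    common = Fraction(1)
    for j in range(1, k + 1):
        common *= A(0) / (A(0) - A(j))
    total = Fraction(0)
    for I, count in tally.items():
        weight = Fraction(1)
        for i in I:
            weight *= A(i) / A(0)
        total += count * weight
    return a ** k / factorial(k) * common * total


def eulerian(k, h):
    """Number of permutations of 1..k with exactly h ascents."""
    return sum(1 for p in permutations(range(1, k + 1)) if len(ascent_set(p)) == h)


def sample_fitness(k):
    """Hypothetical fitness function satisfying hypothesis 21.1."""
    return Fraction(4) if k == 0 else Fraction(1) + Fraction(1, k + 1)
```

The cost is $k!$ permutations per value of $k$, so this is a verification tool for small $k$ rather than a way to compute the distribution.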

We obtained the above formula by writing explicitly the coefficients for small 
values of k. With the help of Sloane’s on-line encyclopedia of integer sequences [83], 
we discovered that these coefficients were the up-down coefficients. Our first proof 
of the formula, done in [18], relied on a difficult combinatorial identity due to Carlitz 
[14]. We present here a simpler more direct derivation. The strategy is to think of 
this formula as a rational fraction in the variables A(1),..., A(k) and to compute 
its partial fraction decomposition, which turns out to be the formula given in the 
previous section. Thus we follow the inverse road that led us from the Eulerian 
numbers to the Stirling numbers when we were playing with the quasispecies on the 
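sharp peak landscape. Let us start. We set

$$K = \{1, \dots, k-1\}$$

and we rewrite the above formula as

$$z(k) = \frac{a^k}{k!} A(0) \sum_{I \subset K} \left\{ {k \atop I} \right\} A(0)^{k-|I|-1} \Big( \prod_{1 \le j \le k} \frac{1}{A(0) - A(j)} \Big) \Big( \prod_{i \in I} A(i) \Big).$$

We then expand the power $A(0)^{k-|I|-1}$:

$$A(0)^{k-|I|-1} = \prod_{j \in K \setminus I} \big( A(0) - A(j) + A(j) \big) = \sum_{J \subset K \setminus I} \Big( \prod_{j \in J} \big( A(0) - A(j) \big) \Big) \prod_{j \in (K \setminus I) \setminus J} A(j).$$

Substituting this into $z(k)$ and simplifying the factors $(A(0) - A(j))$, we obtain

$$z(k) = \frac{a^k}{k!} A(0) \sum_{I \subset K} \left\{ {k \atop I} \right\} \sum_{J \subset K \setminus I} \Big( \prod_{j \in (K \cup \{k\}) \setminus J} \frac{1}{A(0) - A(j)} \Big) \prod_{j \in K \setminus J} A(j).$$

We reindex the sum by setting $H = K \setminus J$ and we get

$$z(k) = \frac{a^k}{k!} \frac{A(0)}{A(k)} \sum_{H \subset K} \Big( \prod_{j \in H \cup \{k\}} \frac{A(j)}{A(0) - A(j)} \Big) \sum_{I \subset H} \left\{ {k \atop I} \right\}.$$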
Let us fix $H \subset K$, say $H = \{i_1, \dots, i_{h-1}\}$, where $1 \le h \le k$ and

$$0 = i_0 < i_1 < \cdots < i_{h-1} < k = i_h,$$

and let us focus on the last sum $\sum_{I \subset H} \left\{ {k \atop I} \right\}$. This sum is the number of permutations
of $1, \dots, k$ whose ascents are located in the index set $H$. Let $B = (B_1, \dots, B_h)$ be an
ordered partition of $\{1, \dots, k\}$ into $h$ subsets such that

$$\forall j \in \{1, \dots, h\} \qquad |B_j| = i_j - i_{j-1}.$$

We list the elements of each set $B_j$ in decreasing order:

$$\forall j \in \{1, \dots, h\} \qquad B_j = (b_j(1), \dots, b_j(i_j - i_{j-1})).$$

We concatenate these lists into a single sequence:

$$b_1(1), \dots, b_1(i_1),\; b_2(1), \dots, b_2(i_2 - i_1),\; \dots,\; b_h(1), \dots, b_h(i_h - i_{h-1}).$$

This sequence corresponds to a permutation of $1, \dots, k$. This construction defines a
one-to-one correspondence between ordered partitions of $\{1, \dots, k\}$ into $h$ subsets
of respective sizes $i_1, \dots, i_h - i_{h-1}$ and the set of the permutations of $1, \dots, k$ whose
ascents are located in the index set $H$. The number of these partitions (called $h$-sharing
in the terminology of [61], see definition 1.17 and proposition 5.5 therein)
is precisely the multinomial coefficient $\binom{k}{i_1,\, i_2 - i_1,\, \dots,\, i_h - i_{h-1}}$ and we conclude that

$$\sum_{I \subset H} \left\{ {k \atop I} \right\} = \frac{k!}{(i_1 - i_0)! \cdots (i_h - i_{h-1})!}.$$
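
The identity between the sum of up-down coefficients and the multinomial coefficient can be confirmed by direct enumeration. In the sketch below (hypothetical function names), a permutation is counted when its ascent set is contained in $H$, which is the meaning of "ascents located in the index set $H$" used above.

```python
from itertools import permutations
from math import factorial


def ascent_set(perm):
    """Positions i (from 1) such that perm has an ascent at i."""
    return {i + 1 for i in range(len(perm) - 1) if perm[i + 1] > perm[i]}


def count_ascents_within(k, H):
    """Number of permutations of 1..k whose ascent set is a subset of H."""
    H = set(H)
    return sum(1 for p in permutations(range(1, k + 1)) if ascent_set(p) <= H)


def multinomial_for(k, H):
    """k! / ((i_1 - i_0)! ... (i_h - i_{h-1})!) with 0 = i_0 < ... < i_h = k,
    where i_1, ..., i_{h-1} are the elements of H."""
    idx = [0] + sorted(H) + [k]
    value = factorial(k)
    for prev, cur in zip(idx, idx[1:]):
        value //= factorial(cur - prev)
    return value
```

With $k = 5$ and $H = \{2, 3\}$ the blocks have sizes 2, 1, 2, and both functions return $5!/(2!\,1!\,2!) = 30$.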


In fact, this combinatorial identity and the above argument are the starting point of 
Carlitz’s work [14]. Carlitz’s goal was to invert this formula, i.e., to express the up- 
down coefficients as sums of multinomial coefficients. Plugging this identity into the 
formula for z(k), we are back to the formula obtained by induction in section 21.1. 


21.3 Re-Expansion 


Our first formula for the distribution of the quasispecies, given in definition 7.1, was a 
series. Summing this series, we got a closed formula involving the Eulerian numbers. 
With the help of a classical combinatorial identity, we could rewrite this formula 
in terms of the Stirling numbers. On the class-dependent fitness landscapes, these 
last two formulas were generalized into two formulas, one involving multinomial 
coefficients, the other involving up-down coefficients. We shall finally expand these 
two formulas in a series in order to obtain the generalization of our first quasispecies 
formula. The strategy is straightforward. We expand each fraction as a geometric 
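series. Starting from the formula involving the up-down coefficients, we obtain
directly

$$z(k) = \frac{a^k}{k!} \sum_{j_1, \dots, j_k \ge 0} \Big( \frac{A(1)}{A(0)} \Big)^{j_1} \cdots \Big( \frac{A(k)}{A(0)} \Big)^{j_k} \sum_{I \subset \{ 1 \le i < k \,:\, j_i \ge 1 \}} \left\{ {k \atop I} \right\}.$$

Starting from the formula involving the multinomial coefficients, we obtain, after
reordering the summation and setting $s_1 = i_1, \dots, s_h = i_h - i_{h-1}$, the following value
for $z(k)$:

$$z(k) = \frac{A(0)}{A(k)} \sum_{1 \le h \le k} \; \sum_{\substack{s_1, \dots, s_h \ge 1 \\ s_1 + \cdots + s_h = k}} \frac{a^k}{s_1! \cdots s_h!} \sum_{n_1, \dots, n_h \ge 1} \Big( \frac{A(s_1)}{A(0)} \Big)^{n_1} \cdots \Big( \frac{A(s_1 + \cdots + s_h)}{A(0)} \Big)^{n_h}.$$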


Chapter 22
Error Threshold


In the previous sections, we solved the recurrence relation (21.3) for z(0) = 1, and 
we explored some of the combinatorial properties of the solution. Nevertheless, our 
main interest is not in the recurrence relation itself, but in the limit equilibrium 
equation (21.1) from which it has been derived. Indeed, the solution to (21.1) must
be given by

$$y(k) = \frac{z(k)}{\sum_{i \ge 0} z(i)}, \qquad k \ge 0,$$


which could be the trivial solution 0. Whether the solution defined by the above 
expression is the trivial one or not depends on whether the series with general 
term z(k) converges or not. In studying this convergence, we will find a criterion 
for the existence of a generalized quasispecies distribution, i.e., a solution of the 
equation (21.1) satisfying the constraint (21.2). This criterion we again call the error 
threshold, since, as we show in theorem 22.3 below, it is formulated in terms of a 
critical mutation rate a above which a quasispecies cannot form. Before jumping to 
the result about the error threshold, we treat the case of eventually constant fitness 
functions, which, in addition to helping us prove the existence of the error threshold, 
is not devoid of interest, for it shows a link between the quasispecies distribution 
for the sharp peak landscape (cf. definition 7.1) and the generalized quasispecies 
distributions. 


22.1 Eventually Constant Fitness Functions 


We study here the particular case of fitness functions that are eventually constant. For
such functions, we can express the quantities $(z(k))_{k \ge 0}$ in terms of the concentrations
$(Q(\sigma, a)(k))_{k \ge 0}$ of the quasispecies associated to the sharp peak landscape fitness
function. Let $A$ be a fitness function which is constant and equal to $\eta > 0$ from
some finite index onwards. Define $(q(k))_{k \ge 0}$ to be the solution to the recurrence
relation (21.3) with fitness function $(A(0), \eta, \eta, \dots)$. The recurrence relation (21.3)
satisfies a simple scaling property. The solution to the recurrence relation for a
certain fitness function $A$ remains unchanged if we multiply all the fitnesses by a
constant. In particular, $(q(k))_{k \ge 0}$ is also the solution to (21.3) for the fitness function
$(A(0)/\eta, 1, 1, \dots)$, i.e.,

$$q(k) = \frac{Q(A(0)/\eta, a)(k)}{Q(A(0)/\eta, a)(0)}, \qquad k \ge 0.$$

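Theorem 22.1 Let $N \ge 0$ be such that

$$A(N) \ne \eta \quad \text{and} \quad \forall k > N \quad A(k) = \eta.$$

Then, for all $k > N$,

$$z(k) = q(k) + \sum_{j=1}^{N} \frac{a^j}{j!}\, q(k-j) \Big( \frac{A(j)}{A(0) - A(j)} - \frac{\eta}{A(0) - \eta} \Big) \Big( 1 + \sum_{h=1}^{j-1} \; \sum_{0 = i_0 < \cdots < i_h < j} \frac{j!}{(i_1 - i_0)! \cdots (i_h - i_{h-1})!\,(j - i_h)!} \prod_{t=1}^{h} \frac{A(i_t)}{A(0) - A(i_t)} \Big).$$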


Let us illustrate this phenomenon for the second simplest possible fitness function:
$A = (\sigma, \eta, 1, 1, \dots)$ with $\sigma > \eta \vee 1$. Then, the recurrence relation (21.3) for the
fitness function $A$ can be rewritten as

$$z(k)\sigma = z(0)\sigma \frac{a^k}{k!} + z(1)\eta \frac{a^{k-1}}{(k-1)!} + \sum_{2 \le h \le k} z(h) \frac{a^{k-h}}{(k-h)!}.$$

Likewise, the recurrence relation (21.3) for the sharp peak fitness function, i.e. for
the function $(\sigma, 1, 1, \dots)$, reads

$$q(k)\sigma = q(0)\sigma \frac{a^k}{k!} + \sum_{1 \le h \le k} q(h) \frac{a^{k-h}}{(k-h)!}.$$

Subtracting the second equation from the first one, and noting that $z(0) = q(0) = 1$,
we get

$$(z(k) - q(k))\sigma = (z(1)\eta - q(1)) \frac{a^{k-1}}{(k-1)!} + \sum_{2 \le h \le k} (z(h) - q(h)) \frac{a^{k-h}}{(k-h)!}.$$

Thus, setting

$$u(0) = \frac{z(1)\eta - q(1)}{\sigma}, \qquad u(k) = z(k+1) - q(k+1), \quad k \ge 1,$$

we obtain the familiar recurrence relation for $u(k)$,

$$u(k)\sigma = u(0)\sigma \frac{a^k}{k!} + \sum_{1 \le h \le k} u(h) \frac{a^{k-h}}{(k-h)!}.$$

We conclude that

$$u(k) = u(0) q(k) = \frac{z(1)\eta - q(1)}{\sigma}\, q(k),$$

from where, for $k \ge 2$,

$$z(k) = q(k) + u(k-1) = q(k) + \frac{z(1)\eta - q(1)}{\sigma}\, q(k-1).$$
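
The relation between $z$ and $q$ for the fitness function $(\sigma, \eta, 1, 1, \dots)$ can be tested in exact arithmetic. The sketch below is an illustration only: the function names are hypothetical and the numerical values of $\sigma$, $\eta$ and $a$ are arbitrary choices satisfying $\sigma > \eta \vee 1$.

```python
from fractions import Fraction
from math import factorial


def z_recurrence(A, a, kmax):
    """Solve (21.3) with z(0) = 1."""
    z = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum(z[h] * A(h) * a ** (k - h) / factorial(k - h) for h in range(k))
        z.append(s / (A(0) - A(k)))
    return z


def compare_with_sharp_peak(sigma, eta, a, kmax):
    """Return pairs (z(k), q(k) + u(0) q(k-1)) for 2 <= k <= kmax, where z is
    computed for (sigma, eta, 1, 1, ...) and q for the sharp peak (sigma, 1, 1, ...)."""
    def perturbed(k):
        if k == 0:
            return sigma
        return eta if k == 1 else Fraction(1)

    def sharp_peak(k):
        return sigma if k == 0 else Fraction(1)

    z = z_recurrence(perturbed, a, kmax)
    q = z_recurrence(sharp_peak, a, kmax)
    u0 = (z[1] * eta - q[1]) / sigma
    return [(z[k], q[k] + u0 * q[k - 1]) for k in range(2, kmax + 1)]
```

For instance, `compare_with_sharp_peak(Fraction(5), Fraction(2), Fraction(1, 2), 8)` should return pairs whose two entries coincide.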

Replacing $z(1)$ and $q(1)$ by their values, we get the expression of theorem 22.1. In
fact, theorem 22.1 is a consequence of a more general phenomenon that we explain
next. For a fitness landscape $A$ and $k \ge 0$, we define the fitness landscape $A^{(k)}$
obtained by shifting $k$ places to the left the fitnesses of the different classes and
keeping the fitness of the class 0, that is,

$$A^{(k)}(j) = \begin{cases} A(0) & \text{if } j = 0, \\ A(j+k) & \text{if } j \ge 1. \end{cases}$$

Fig. 22.1 Example of a shifted fitness landscape.

For a fitness landscape $A$, we denote by $(z^A(k))_{k \ge 1}$ the solution to the recurrence
(21.3) corresponding to $A$. The following lemma allows us to express the
value of $z^A(k)$ as a function of the quantities $z^{A^{(1)}}(k-1), \dots, z^{A^{(k-1)}}(1)$.

Lemma 22.2 For all $k \ge 2$, we have

$$z^A(k) = \frac{a^k}{k!} \frac{A(0)}{A(0) - A(k)} + \sum_{j=1}^{k-1} \frac{a^j}{j!} \frac{A(j)}{A(0) - A(j)}\, z^{A^{(j)}}(k-j).$$
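
The decomposition of lemma 22.2 lends itself to a direct numerical check. The sketch below builds the shifted landscape $A^{(j)}$ as a closure and compares both sides of the lemma in exact arithmetic; the function names and the sample fitness function are hypothetical.

```python
from fractions import Fraction
from math import factorial


def z_recurrence(A, a, kmax):
    """Solve (21.3) with z(0) = 1."""
    z = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum(z[h] * A(h) * a ** (k - h) / factorial(k - h) for h in range(k))
        z.append(s / (A(0) - A(k)))
    return z


def shifted(A, j):
    """Landscape A^(j): class 0 keeps fitness A(0), class i >= 1 gets A(i + j)."""
    return lambda i: A(0) if i == 0 else A(i + j)


def lemma_rhs(A, a, k):
    """Right-hand side of lemma 22.2."""
    value = a ** k / factorial(k) * A(0) / (A(0) - A(k))
    for j in range(1, k):
        z_shift = z_recurrence(shifted(A, j), a, k - j)[k - j]
        value += a ** j / factorial(j) * A(j) / (A(0) - A(j)) * z_shift
    return value


def sample_fitness(k):
    """Hypothetical fitness function satisfying hypothesis 21.1."""
    return Fraction(4) if k == 0 else Fraction(1) + Fraction(1, k + 1)
```

The first term accounts for a single jump from class 0 to class $k$, and the sum splits the remaining contributions according to the first intermediate class $j$ that is visited.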


The result of theorem 22.1 is obtained by applying the lemma recursively until the
quantity $z^A(k)$ is expressed as a combination of the coefficients of the quasispecies
associated to the sharp peak landscape.


22.2 Error Threshold 


In view of the previous section, it is straightforward to deduce the existence of an 
error threshold for fitness functions that are eventually constant from the existence 
of the error threshold for the sharp peak landscape. As we show next, we can extend 
the error threshold for eventually constant fitness functions to any fitness function 
satisfying hypothesis 21.1. 


Theorem 22.3 We have the following dichotomy:

• If $A(0)e^{-a} > \limsup_{n \to \infty} A(n)$, then the series with general term $z(k)$ converges
and there exists a unique quasispecies.

• If $A(0)e^{-a} < \liminf_{n \to \infty} A(n)$, then the series with general term $z(k)$ diverges and
no quasispecies exists.
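
The dichotomy can be observed numerically on the sharp peak landscape $(\sigma, 1, 1, \dots)$, where the threshold is $\sigma e^{-a} = 1$. The sketch below computes partial sums of $z(k)$ in floating point; the function names and the chosen parameters are hypothetical. Below the threshold, the mean fitness identity $(\sigma - 1) y(0) + 1 = \sigma e^{-a}$ together with $y(0) = 1/\sum_k z(k)$ suggests that the partial sums approach $(\sigma - 1)/(\sigma e^{-a} - 1)$. A finite computation can only illustrate the behavior; it does not prove convergence or divergence.

```python
from math import exp, factorial


def z_recurrence(A, a, kmax):
    """Solve (21.3) with z(0) = 1 in floating point."""
    z = [1.0]
    for k in range(1, kmax + 1):
        s = sum(z[h] * A(h) * a ** (k - h) / factorial(k - h) for h in range(k))
        z.append(s / (A(0) - A(k)))
    return z


def partial_sum(A, a, K):
    """Sum of z(0), ..., z(K)."""
    return sum(z_recurrence(A, a, K))


def sharp_peak(sigma):
    """Fitness function (sigma, 1, 1, ...)."""
    return lambda k: sigma if k == 0 else 1.0


if __name__ == "__main__":
    A = sharp_peak(4.0)
    for a in (1.0, 2.0):  # ln 4 is about 1.386, so a = 1 is below, a = 2 above
        print(a, [partial_sum(A, a, K) for K in (20, 40, 80)])
```

With $\sigma = 4$, the partial sums for $a = 1$ stabilize quickly, whereas for $a = 2$ they keep growing by orders of magnitude as $K$ increases.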


It is remarkable that the error threshold depends only on the average number of
mutations per genotype per reproduction cycle $a$, the fitness of the master sequence
$A(0)$, and the limiting behavior of the fitness function $A$. Before proceeding to
the proof of the theorem, we give a useful lemma that allows us to compare the
solutions of (21.3) for different fitness functions. For a fitness function $A$, we denote
by $(z^A(k))_{k \ge 0}$ the solution to the recurrence relation (21.3) corresponding to the
function $A$.


Lemma 22.4 Let $A$ and $B$ be two fitness functions satisfying $A(0) = B(0)$ and
$A(k) \ge B(k)$ for all $k \ge 1$. Then, for all $k \ge 0$, we have $z^A(k) \ge z^B(k)$.


Proof The result follows from the inequality

$$\frac{1}{A(0) - A(k)} \sum_{j=0}^{k-1} z^A(j) A(j) \frac{a^{k-j}}{(k-j)!} \ge \frac{1}{B(0) - B(k)} \sum_{j=0}^{k-1} z^B(j) B(j) \frac{a^{k-j}}{(k-j)!}$$

along with an induction argument. □

We proceed now to the proof of the error threshold theorem 22.3.


Proof Suppose first that the function $A$ is constant and equal to $\eta > 0$ from $N$
onwards. In view of theorem 22.1, the behavior of the series of general term $z(k)$ is
the same as that of the series of general term $q(k)$, i.e., it converges if $A(0)e^{-a} > \eta$,
and it diverges otherwise. If the function $A$ is not eventually constant, we set

$$\eta^+ = \limsup_{n \to \infty} A(n), \qquad \eta^- = \liminf_{n \to \infty} A(n).$$

Let us prove the first statement of the theorem. Let $\varepsilon > 0$, pick $N > 0$ large enough
so that for all $k \ge N$, $A(k) < \eta^+ + \varepsilon$. We define the function $A^N$ by:

$$\forall k \ge 0 \qquad A^N(k) = \begin{cases} A(k) & \text{if } 0 \le k < N, \\ \eta^+ + \varepsilon & \text{if } k \ge N. \end{cases}$$

For $\varepsilon$ small enough, $A^N(0)e^{-a} > \eta^+ + \varepsilon$. Since $A^N$ is constant and equal to $\eta^+ + \varepsilon$
from $N$ onwards, the series with general term $z^{A^N}(k)$ converges. By the comparison
lemma 22.4, the same is true for the series with general term $z^A(k)$. We prove next
that if $A(0)e^{-a} < \eta^-$, then the series with general term $z(k)$ diverges. We define the
function $A_N$ by:

$$\forall k \ge 0 \qquad A_N(k) = \begin{cases} A(0) & \text{if } k = 0, \\ 0 & \text{if } 1 \le k < N, \\ \eta^- - \varepsilon & \text{if } k \ge N. \end{cases}$$

For $\varepsilon$ small enough, we have $A_N(0)e^{-a} < \eta^- - \varepsilon$. Since $A_N$ is constant and equal
to $\eta^- - \varepsilon$ from $N$ onwards, the series with general term $z^{A_N}(k)$ is divergent. By the
comparison lemma 22.4, the same is true for the series with general term $z^A(k)$. □


22.3 Further Solutions 


The quasispecies distribution discussed above, when it exists, is the only solution of
the system (21.1) satisfying both the constraint (21.2) and the condition $y(0) > 0$.
In the case of the sharp peak landscape, the quasispecies distribution is even the
only solution satisfying the constraint (21.2). However, for a fitness function $A$
satisfying the hypothesis 21.1, additional solutions may exist. We now drop the
condition $y(0) > 0$. Let $K \ge 0$. We call $(y(k))_{k \ge 0}$ a quasispecies distribution around
$K$ associated to $A$ if it is a non-negative solution of (21.1) such that

$$y(0) = \cdots = y(K-1) = 0 < y(K), \qquad \sum_{k \ge K} y(k) = 1.$$

Lemma 22.5 Let $K \ge 0$, and define the fitness function $B$ by

$$\forall k \ge 0 \qquad B(k) = A(K + k).$$

The sequence $(y(k))_{k \ge 0}$ is a quasispecies distribution around $K$ associated to $A$
if and only if the sequence $(y(K+i))_{i \ge 0}$ is a quasispecies distribution around 0
associated to $B$.


Proof Let the sequence $(y(k))_{k \ge 0}$ be a quasispecies distribution around $K$ associated
to $A$. Since $y(0) = \cdots = y(K-1) = 0$, we have, for all $k \ge K$,

$$0 = \sum_{j=K}^{k} y(j) A(j) e^{-a} \frac{a^{k-j}}{(k-j)!} - y(k) \sum_{j \ge K} y(j) A(j).$$

We set $i = k - K$ and $h = j - K$ in the above formula and we see that, for all $i \ge 0$,

$$\begin{aligned} 0 &= \sum_{h=0}^{i} y(K+h) A(K+h) e^{-a} \frac{a^{i-h}}{(i-h)!} - y(K+i) \sum_{h \ge 0} y(K+h) A(K+h) \\ &= \sum_{h=0}^{i} y(K+h) B(h) e^{-a} \frac{a^{i-h}}{(i-h)!} - y(K+i) \sum_{h \ge 0} y(K+h) B(h). \end{aligned}$$

Therefore, the sequence $(y(K+i))_{i \ge 0}$ is a quasispecies distribution around 0 associated
to $B$. The converse implication is proved similarly. □


Lemma 22.6 Let us suppose that there exists a $K \ge 1$ such that

$$A(K) > \max_{0 \le k < K} A(k).$$

Then, for $k \in \{0, \dots, K-1\}$, no quasispecies distribution around $k$ associated to
$A$ exists.


Proof Let us suppose that the sequence $(y(k))_{k \ge 0}$ is a solution of the equation (21.1).
Let $k \in \{0, \dots, K-1\}$ and let us suppose further that $y(0) = \cdots = y(k-1) = 0$
and $y(k) \ne 0$. We will show that if $y(k) \ge 0, \dots, y(K-1) \ge 0$, then necessarily
$y(K) < 0$. On one hand, writing down the $K$-th equation of (21.1), we see that

$$y(K) = \frac{1}{\phi(y) - A(K)e^{-a}} \sum_{j=0}^{K-1} y(j) A(j) e^{-a} \frac{a^{K-j}}{(K-j)!},$$

where $\phi(y)$ denotes the mean fitness of $y$. On the other hand, writing down the $k$-th
equation of (21.1), since $y(0) = \cdots = y(k-1) = 0$ and $y(k) > 0$, we conclude that
$\phi(y) = A(k)e^{-a}$. Since $A(k) < A(K)$, if $y(k) \ge 0, \dots, y(K-1) \ge 0$, then necessarily
$y(K) < 0$. This implies that no quasispecies around $k$ associated to $A$ exists. □


For a fitness function $A$, we define the set of indices $I_A$ by

$$I_A = \Big\{\, i \ge 0 : A(i)e^{-a} > \limsup_{j \to \infty} A(j) \ \text{ and } \ A(i) > \sup_{j > i} A(j) \,\Big\}. \tag{22.1}$$

We remark that the set $I_A$ is finite. Combining the two lemmas above, we are
able to characterize the set of the solutions of the system (21.1) that satisfy the
constraint (21.2).


Theorem 22.7 The system (21.1) has as many solutions satisfying the constraint
(21.2) as there are elements in $I_A$. Moreover, for each $i \in I_A$, the associated solution
$y^i$ satisfies

$$y^i(0) = \cdots = y^i(i-1) = 0 < y^i(i).$$
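
For an eventually constant fitness function, the index set $I_A$ of (22.1) can be computed by inspection. The sketch below assumes, purely for illustration, that the fitness function is given by a finite list `head` of initial values followed by a constant tail `eta`, so that the limsup is `eta`; the function name and this representation are hypothetical.

```python
from math import exp


def index_set(head, eta, a):
    """Indices i in I_A for the fitness function
    (head[0], ..., head[-1], eta, eta, ...), with mutation parameter a > 0.

    Indices beyond the head are never included: there A(i) = eta, and
    eta * exp(-a) > eta fails for a > 0 and eta > 0.
    """
    limsup = eta
    result = []
    for i, fitness in enumerate(head):
        sup_after = max(head[i + 1:] + [eta])  # sup of A(j) over j > i
        if fitness * exp(-a) > limsup and fitness > sup_after:
            result.append(i)
    return result
```

For example, with `head = [5.0, 2.0, 6.0, 3.0]`, `eta = 1.0` and `a = 0.5`, classes 0 and 1 are excluded because class 2 is fitter than both, and the function returns the indices 2 and 3. By theorem 22.7, each returned index corresponds to one solution of (21.1) satisfying (21.2).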


Chapter 23
Probabilistic Representation


We follow here the same road as in chapter 8, but we present only conjectures and 
possible directions for future investigations. We prove that the Perron—Frobenius 
vector of the finite system converges, in the long chain regime, to the quasispecies 
distribution and we try to perform a formal passage to the limit in the probabilistic 
representation of the Perron—Frobenius vector. This way, we obtain probabilistic 
representations of the generalized quasispecies distribution, as well as a probabilistic 
criterion for its existence. 


23.1 Asymptotics of Perron—Frobenius 


As usual, we consider the mean reproduction matrix $W$ defined by

$$W(i, j) = A_H(i) M_H(i, j), \qquad 0 \le i, j \le \ell.$$

We denote by $c^*$ the normalized Perron-Frobenius eigenvector of $W$, and by $\lambda$ the
Perron-Frobenius eigenvalue of $W$.


Proposition 23.1 We have the following dichotomy. In the asymptotic regime $\ell \to \infty$,
$q \to 0$ and $\ell q \to a$,

• if $A_H(0)e^{-a} < \liminf_{k \to \infty} A_H(k)$, then $c^* \to 0$.

• if $A_H(0)e^{-a} > \limsup_{k \to \infty} A_H(k)$, then $\lambda \to A_H(0)e^{-a}$ and $c^* \to y^0$, where $y^0$ is
the quasispecies distribution defined in theorem 22.7.
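
The convergence stated in the proposition can be explored numerically for a moderate chain length. The sketch below is written under an explicit assumption: the matrix $M_H$ is not defined in this section, so the code takes it to be the lumped mutation matrix on Hamming classes for independent per-site mutations with probability $q$, uniform over the $\kappa - 1$ other letters. All function names are hypothetical. The left Perron-Frobenius eigenvector is approximated by power iteration, and the sharp peak landscape is used as the test case, for which the limiting value of $c^*(0)$ is $(\sigma e^{-a} - 1)/(\sigma - 1)$.

```python
from math import comb, exp


def mutation_matrix(ell, q, kappa):
    """Assumed form of M_H: transition probabilities between Hamming classes.

    From class b, each of the ell - b matching sites becomes a mismatch with
    probability q, and each of the b mismatching sites reverts to the master
    letter with probability q / (kappa - 1)."""
    back = q / (kappa - 1)
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for b in range(ell + 1):
        for up in range(ell - b + 1):
            p_up = comb(ell - b, up) * q ** up * (1 - q) ** (ell - b - up)
            for down in range(b + 1):
                p_down = comb(b, down) * back ** down * (1 - back) ** (b - down)
                M[b][b + up - down] += p_up * p_down
    return M


def perron_left(A, M, iterations=500):
    """Power iteration for the left eigenvector of W(i, j) = A(i) M(i, j),
    normalized to sum to 1. Returns (eigenvalue estimate, eigenvector)."""
    n = len(M)
    c = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        new = [sum(c[i] * A(i) * M[i][j] for i in range(n)) for j in range(n)]
        lam = sum(new)  # equals the mean fitness of c, since rows of M sum to 1
        c = [x / lam for x in new]
    return lam, c


if __name__ == "__main__":
    ell, a, sigma, kappa = 40, 0.5, 4.0, 2
    M = mutation_matrix(ell, a / ell, kappa)
    lam, c = perron_left(lambda i: sigma if i == 0 else 1.0, M)
    print(lam, sigma * exp(-a))
    print(c[0], (sigma * exp(-a) - 1) / (sigma - 1))
```

For finite $\ell$ the eigenvalue and eigenvector only approximate their limits, so the two printed pairs are expected to be close, not equal.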


Proof Let us denote by $\lambda$ the Perron-Frobenius eigenvalue of the matrix $W$, and
recall that $\lambda$ represents the mean fitness of the population at equilibrium, that is,
$\lambda = \phi(c^*)$. Up to the extraction of a subsequence, we can suppose that the following
limits exist:

$$\lambda^* = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \lambda, \qquad \eta^*(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} c^*(k), \quad k \ge 0.$$

Note that $\eta^*$ must belong to $S^\infty$. Moreover, by Fatou's lemma, we have

$$\sum_{k \ge 0} \eta^*(k) \le 1. \tag{23.1}$$

Writing down the $k$-th equation of the system $\lambda (c^*)' = (c^*)' W$, we see that

$$\Big| \sum_{i=0}^{k} c^*(i) A_H(i) M_H(i, k) - \lambda c^*(k) \Big| \le \max_{k < i \le \ell} A_H(i) M_H(i, k).$$

The right-hand side goes to 0. Taking the limit, we see that $\lambda^*$ and $\eta^*$ satisfy the
system

$$\forall k \ge 0 \qquad \lambda^* \eta^*(k) = \sum_{i=0}^{k} \eta^*(i) A_H(i) e^{-a} \frac{a^{k-i}}{(k-i)!}. \tag{23.2}$$

The first equation of the system $\lambda (c^*)' = (c^*)' W$ yields that

$$c^*(0) A_H(0) M_H(0, 0) \le \lambda c^*(0).$$

Since $c^*(0) > 0$, we can divide by $c^*(0)$ and, passing to the limit, we conclude that
$\lambda^* \ge A_H(0)e^{-a}$. Suppose that $\eta^*$ is not identically equal to 0. Let $k_0 \ge 0$ be the
smallest index such that $\eta^*(k_0) \ne 0$. Writing down the $k_0$-th equation of the above
system yields that $\lambda^* = A_H(k_0)e^{-a}$. By hypothesis, we have that $A_H(k) < A_H(0)$
for $k \ge 1$, thus the only possibility is that $k_0 = 0$ and therefore $\lambda^* = A_H(0)e^{-a}$. In
particular, the system (23.2) becomes the same as the system (21.3), that was already
discussed in theorem 22.3. If $A_H(0)e^{-a} < \liminf_{k \to \infty} A_H(k)$, the only solution of the
system satisfying in addition (23.1) is the null solution, thus $\eta^*$ has to be identically
equal to 0 and $c^* \to 0$. If $A_H(0)e^{-a} > \limsup_{k \to \infty} A_H(k)$, then $y^0$ is the only
solution of the limit equation (21.1) belonging to $S^\infty$, satisfying both the constraint
(21.2) and the condition $\phi(y^0) \ge A_H(0)e^{-a}$, thus we conclude that $c^* \to y^0$. □


23.2 Mutant Walk Representation 


For simplicity, we assume that the following hypothesis holds. 


Hypothesis 23.2 The fitness function $A_H$ is such that

$$A_H(0)e^{-a} > \limsup_{k \to \infty} A_H(k).$$

Under this hypothesis, we know from theorem 22.3 that a quasispecies exists. We
denote by $c^0$ the Perron-Frobenius eigenvector of $W$ such that $c^0(0) = 1$. Of course,
we have that $c^0 = (c^*(0))^{-1} c^*$, and a consequence of proposition 23.1 is that $c^0 \to z$,
where $z = (z(k))_{k \ge 0}$ is defined in formula (21.4).

We proceed as in section 8.2. We start from the probabilistic representation of
the solution of equation (2.1) in terms of a stopped random walk. We consider the
mutant walk $(W_n)_{n \ge 0}$ on $\mathcal{A}^\ell$ which has $M$ for its transition matrix. Let $\tau^*$ be the
time of the first return to $w^*$, defined by

$$\tau^* = \inf\{\, n \ge 1 : W_n = w^* \,\}.$$

We apply proposition 5.1 in this specific setting, with $w^*$ for the starting point. The
left Perron-Frobenius eigenvector $c^0$ of the matrix $W$ satisfying $c^0(w^*) = 1$ is given
by

$$\forall u \in \mathcal{A}^\ell \qquad c^0(u) = E_{w^*}\Big( \sum_{n=0}^{\tau^* - 1} 1_{\{W_n = u\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big). \tag{23.3}$$


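Let us examine what becomes of this formula in the long chain regime. Let $k \ge 1$
be fixed. To alleviate the notation, we write simply $E$ instead of $E_{w^*}$. Let us denote
by $\mathcal{H}_k$ the $k$-th Hamming class. Our assumptions yield that

$$z(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{u \in \mathcal{H}_k} c^0(u) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} E\Big( \sum_{n=0}^{\tau^* - 1} 1_{\{W_n \in \mathcal{H}_k\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big).$$

As in the case of the sharp peak landscape, the distribution of the random time $\tau^*$
converges towards $e^{-a}\delta_1 + (1 - e^{-a})\delta_{+\infty}$. Passing to the limit in the previous formula,
we obtain

$$\begin{aligned} z(k) &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} E\Big[ \sum_{n \ge 1} 1_{\{W_n \in \mathcal{H}_k\}} 1_{\{\tau^* > n\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big] \\ &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{n=1}^{\infty} E\Big[ 1_{\{W_n \in \mathcal{H}_k\}} 1_{\{\tau^* > n\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big]. \end{aligned} \tag{23.4}$$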


Of course this passage to the limit would require a rigorous justification. However 
our goal here is to derive with a probabilistic method the combinatorial formulas for 
the quasispecies obtained in sections 21.1 and 21.2. To avoid lengthy details and the 
appeals to additional technical conditions, we simply perform the required passages 
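to the limits without full justifications. Next, we rewrite the term in the sum as

$$E\Big[ 1_{\{W_n \in \mathcal{H}_k,\ \tau^* > n\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big] = E\Big[ 1_{\{W_n \in \mathcal{H}_k\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big] - E\Big[ 1_{\{W_n \in \mathcal{H}_k,\ \tau^* \le n\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big].$$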


Applying the Markov property to the second term, we obtain
n-2


nT A(W, A(w* A(W,; 
E| lw, cruamiew's| | oo) ou Mai qyE e(tov.rmf ow). 


We have assumed that the Perron-Frobenius eigenvalue $\lambda$ converges towards
$A(w^*)e^{-a}$. Thus the factor in front of the expectation above converges to 1. Putting
the previous computation into formula (23.4), we obtain a telescopic series, and, up
to the interchange of some limits that would require further justification, we expect
that, for $k \ge 1$,

$$z(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \; \lim_{n \to \infty} E\Big[ 1_{\{W_n \in \mathcal{H}_k\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big]. \tag{23.5}$$

23.3 Computation of the Limit 


While formula (23.4) is a generalization of formula (8.10) for a class-dependent
fitness landscape, the last formula (23.5) did not appear naturally in the case of the
sharp peak landscape. Our subsequent analysis rests entirely on this formula. To
study this limit, we proceed as in section 8.2. We keep track of the mutation events
with the help of the following random binary array. For $1 \le i \le n$ and $1 \le j \le \ell$,
we set $E_{i,j} = 1$ if a mutation occurs on the $j$-th digit at time $i$, and 0 otherwise. The
random variables $E_{i,j}$ are i.i.d. Bernoulli variables with parameter $q$. We introduce
the event $\mathcal{E}_n$ that, until time $n$, there is at most one mutation event in each column
of the array $(E_{i,j},\ 1 \le i \le n,\ 1 \le j \le \ell)$. As shown in section 8.2, the event $\mathcal{E}_n$ is
typical, i.e., its probability goes to 1, and we can restrict the expectation to the event
$\mathcal{E}_n$ without altering the limit. In order to compute the asymptotic behavior of the
expectation, we introduce the jumping times of the Markov chain $(W_n)_{n \ge 0}$. More
precisely, we define $\tau_0 = 0$ and for $l \ge 1$,

$$\tau_l = \min\{\, t > \tau_{l-1} : W_t \ne W_{\tau_{l-1}} \,\}.$$

We then decompose the expectation as follows:

$$E\Big( 1_{\{W_n \in \mathcal{H}_k,\ \mathcal{E}_n\}} \prod_{t=0}^{n-1} \frac{A(W_t)}{\lambda} \Big) = \sum_{r=1}^{k} \; \sum_{x_0 = w^* < x_1 < \cdots < x_r \in \mathcal{H}_k} \; \sum_{1 \le t_1 < \cdots < t_r \le t_{r+1} = n} P\Big( \begin{matrix} \tau_1 = t_1, \dots, \tau_r = t_r,\ \tau_{r+1} > n \\ W_{\tau_1} = x_1, \dots, W_{\tau_r} = x_r \end{matrix} \Big) \prod_{l=0}^{r} \Big( \frac{A(x_l)}{\lambda} \Big)^{t_{l+1} - t_l}. \tag{23.6}$$


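The notation $x_0 = w^* < x_1 < \cdots < x_r$ means that the sequence $x_1, \dots, x_r$ is compatible
with the event $\mathcal{E}_n$, i.e., there is at most one mutation on each site until time
$n$. Let us fix one such sequence. We have then (with the convention that $t_0 = 0$)

$$\begin{aligned} P\Big( \begin{matrix} \tau_1 = t_1, \dots, \tau_r = t_r,\ \tau_{r+1} > n \\ W_{\tau_1} = x_1, \dots, W_{\tau_r} = x_r \end{matrix} \Big) &= \prod_{0 \le l \le r-1} \Big( (1-q)^{\ell (t_{l+1} - t_l - 1)} \Big( \frac{q}{\kappa - 1} \Big)^{d_H(x_l, x_{l+1})} (1-q)^{\ell - d_H(x_l, x_{l+1})} \Big) \, (1-q)^{\ell (n - t_r)} \\ &= (1-q)^{\ell n} \Big( \frac{q}{(\kappa - 1)(1 - q)} \Big)^{k}. \end{aligned}$$

In the limit where $n \to \infty$ and $q \to 0$, this expression is equivalent to

$$e^{-an} \Big( \frac{q}{\kappa - 1} \Big)^{k}.$$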

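Putting this into formulas (23.6) and (23.5), we obtain

$$z(k) = \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \; \lim_{n \to \infty} \sum_{1 \le r \le k} \; \sum_{x_0 = w^* < x_1 < \cdots < x_r \in \mathcal{H}_k} \; \sum_{1 \le t_1 < \cdots < t_r \le t_{r+1} = n} \Big( \frac{q}{\kappa - 1} \Big)^{k} \prod_{l=1}^{r} \Big( \frac{A(x_l)}{A(w^*)} \Big)^{t_{l+1} - t_l}. \tag{23.7}$$

The point is that we can now send $n$ to $\infty$ and then perform the summation over
$t_1, \dots, t_r, t_{r+1}$. Setting $t'_l = t_{l+1} - t_l$ for $1 \le l \le r$, we get

$$\begin{aligned} z(k) &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{1 \le r \le k} \; \sum_{x_0 = w^* < x_1 < \cdots < x_r \in \mathcal{H}_k} \Big( \frac{q}{\kappa - 1} \Big)^{k} \sum_{\substack{t'_1, \dots, t'_{r-1} \ge 1 \\ t'_r \ge 0}} \; \prod_{l=1}^{r} \Big( \frac{A(x_l)}{A(w^*)} \Big)^{t'_l} \\ &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{1 \le r \le k} \; \sum_{x_0 = w^* < x_1 < \cdots < x_r \in \mathcal{H}_k} \Big( \frac{q}{\kappa - 1} \Big)^{k} \Big( \prod_{l=1}^{r-1} \sum_{t \ge 1} \Big( \frac{A(x_l)}{A(w^*)} \Big)^{t} \Big) \sum_{t \ge 0} \Big( \frac{A(x_r)}{A(w^*)} \Big)^{t} \\ &= \lim_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \sum_{1 \le r \le k} \; \sum_{x_0 = w^* < x_1 < \cdots < x_r \in \mathcal{H}_k} \Big( \frac{q}{\kappa - 1} \Big)^{k} \frac{A(w^*)}{A(k)} \prod_{l=1}^{r} \frac{A(x_l)}{A(w^*) - A(x_l)}. \end{aligned} \tag{23.8}$$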


Notice that the summation for the variable $t'_r$ starts at 0, which leads to the presence
of the factor $A(w^*)/A(k)$. So far, we have not used the fact that the fitness function $A$
is a function of the Hamming class. In the next step, we shall group together the terms
in the sum which give rise to the same value of the product. For $r \in \{1, \dots, k\}$,
we denote by $\mathcal{M}(r, k)$ the collection of the matrices of size $r \times k$ whose entries
are 0 or 1 and which contain at least one one in each row and exactly one one in
each column. To each $x_1, \dots, x_r$ satisfying $w^* < x_1 < \cdots < x_r \in \mathcal{H}_k$, we associate
a matrix $M(x_1, \dots, x_r)$ in $\mathcal{M}(r, k)$ as follows. First we construct the binary array of
size $r \times \ell$ which records the mutations occurring in the sequence $x_1, \dots, x_r$. More
precisely, the entry of line $i$ column $j$ is equal to 1 if the $j$-th digit of $x_{i-1}$ and $w^*$
are the same, but it is different in $x_i$, and it is equal to 0 otherwise. The condition
imposed on $x_1, \dots, x_r$ implies that exactly $k$ columns of this array contain non-zero
entries. We keep only these $k$ columns in order to obtain the matrix $M(x_1, \dots, x_r)$,
which belongs then to $\mathcal{M}(r, k)$. Now, the point of this construction is that the product
in formula (23.8) depends only on the matrix $M(x_1, \dots, x_r)$. More precisely, given
a matrix $M \in \mathcal{M}(r, k)$, we define its range as

$$\mathrm{range}(M) = \Big\{ \sum_{1 \le i \le h} \; \sum_{1 \le j \le k} M(i, j) : 1 \le h \le r \Big\}.$$

The set $\mathrm{range}(M(x_1, \dots, x_r))$ is the set of the Hamming classes visited by the sequence
$x_1, \dots, x_r$. Now, if $\mathrm{range}(M(x_1, \dots, x_r)) = R$, then

$$\prod_{l=1}^{r} \frac{A(x_l)}{A(w^*) - A(x_l)} = \prod_{i \in R} \frac{A_H(i)}{A_H(0) - A_H(i)}.$$


In formula (23.8), we group together the terms which give rise to the same matrix 
M(x1,..., Xx), as follows: 


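\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sum_{1\le r\le k}\ \sum_{M\in\mathcal{M}(r,k)}\ \sum_{\substack{w^*<x_1<\cdots<x_r\in H_k\\ M(x_1,\dots,x_r)=M}} \Big(\frac{q}{\kappa-1}\Big)^{k}\ \frac{A_H(0)}{A_H(k)}\ \prod_{i\in\operatorname{range}(M)}\frac{A_H(i)}{A_H(0)-A_H(i)}. \tag{23.9}
\]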
Once $M$ is fixed, to obtain a sequence $x_1,\dots,x_r$ such that $M(x_1,\dots,x_r) = M$, we
simply choose the $k$ columns in the array of size $r \times \ell$ where the ones are to be
located, as well as the types of the digits where mutations occur, so the number of
such sequences is simply $\binom{\ell}{k}(\kappa-1)^k$. Since

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}} \binom{\ell}{k}(\kappa-1)^k\Big(\frac{q}{\kappa-1}\Big)^{k} = \frac{a^k}{k!},
\]


we conclude that 


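\[
z(k) = \sum_{1\le r\le k}\ \sum_{M\in\mathcal{M}(r,k)} \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{i\in\operatorname{range}(M)}\frac{A_H(i)}{A_H(0)-A_H(i)}. \tag{23.10}
\]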


23.4 Rearranging the Sums 


We shall finally compute in two different ways the sum above. Our first strategy 
consists in grouping the terms in the sums according to the range of the matrix 
$M$. Notice that the range of a matrix $M$ in $\mathcal{M}(r,k)$ has cardinality $r$, and it always
contains $k$. Thus we write

\[
z(k) = \sum_{1\le r\le k}\ \sum_{0<i_1<\cdots<i_r=k}\ \sum_{\substack{M\in\mathcal{M}(r,k)\\ \operatorname{range}(M)=\{i_1,\dots,i_r\}}} \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{l=1}^{r}\frac{A_H(i_l)}{A_H(0)-A_H(i_l)}. \tag{23.11}
\]


The product depends on the matrix $M$ only via its range. We evaluate next the number
of the matrices $M$ whose range is equal to $\{i_1,\dots,i_r\}$. By definition, these matrices
have $i_1$ ones in their first row, $i_2 - i_1$ ones in their second row, and so on. Thus the
number of terms in the innermost sum is

\[
\big|\{ M\in\mathcal{M}(r,k) : \operatorname{range}(M)=\{i_1,\dots,i_r\} \}\big| = \frac{k!}{i_1!\,(i_2-i_1)!\cdots(k-i_{r-1})!}.
\]
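
The counting step above can be checked by brute force for a small $k$. The following is a minimal sketch, not taken from the text: the function names `matrices_by_range` and `multinomial_count` are hypothetical, and a matrix of $\mathcal{M}(r,k)$ is encoded (an assumption of this sketch) by the row index of the single one of each column.

```python
from collections import Counter
from itertools import product
from math import factorial


def matrices_by_range(k):
    """Count the matrices of M(r, k), 1 <= r <= k, grouped by their range.

    A matrix is encoded by rows[j] = index of the row holding the one of column j.
    """
    counts = Counter()
    for r in range(1, k + 1):
        for rows in product(range(r), repeat=k):
            if len(set(rows)) < r:  # some row would be empty: not in M(r, k)
                continue
            sizes = [rows.count(i) for i in range(r)]
            # range(M): cumulative numbers of ones in the first h rows, h = 1..r
            rng = tuple(sum(sizes[:h]) for h in range(1, r + 1))
            counts[rng] += 1
    return counts


def multinomial_count(rng):
    """k! / (i_1! (i_2 - i_1)! ... (k - i_{r-1})!) for rng = (i_1, ..., i_r = k)."""
    result, previous = factorial(rng[-1]), 0
    for i in rng:
        result //= factorial(i - previous)
        previous = i
    return result


counts = matrices_by_range(4)
for rng in sorted(counts):
    print(rng, counts[rng], multinomial_count(rng))
```

For $k = 4$ the two printed columns agree for each of the possible ranges, which is the identity used to pass from (23.11) to the next formula.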


We conclude that 


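\[
z(k) = \sum_{1\le r\le k}\ \sum_{0<i_1<\cdots<i_r=k} \frac{a^k}{i_1!\,(i_2-i_1)!\cdots(i_r-i_{r-1})!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{1\le l\le r}\frac{A_H(i_l)}{A_H(0)-A_H(i_l)}. \tag{23.12}
\]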


We have recovered the formula (21.4). Our second strategy consists in grouping 
the terms in the sums according to the permutation associated to the matrix M. 
This permutation is constructed as in section 9.2. More precisely, let M be a matrix 
belonging to $\mathcal{M}(r,k)$. We screen the matrix $M$ row by row from the top to the bottom
and we record the sequence of the indices of the columns containing the ones. For
$1 \le j \le k$, let us denote by $\sigma(j)$ the index of the column where the $j$-th one is
encountered during the screening process. The resulting sequence $\sigma(1),\dots,\sigma(k)$ is
a permutation belonging to $\mathfrak{S}_k$, the collection of the permutations of $k$ elements. We
denote this permutation by $\sigma(M)$. We rewrite formula (23.10) by grouping together
the matrices $M$ which give rise to the same permutation:

\[
z(k) = \sum_{\sigma\in\mathfrak{S}_k}\ \sum_{1\le r\le k}\ \sum_{\substack{M\in\mathcal{M}(r,k)\\ \sigma(M)=\sigma}} \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{i\in\operatorname{range}(M)}\frac{A_H(i)}{A_H(0)-A_H(i)}. \tag{23.13}
\]


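We sum further according to the range $R$ of the matrix $M$. Notice that, for
$M \in \mathcal{M}(r,k)$, the range has cardinality $r$ and contains $k$. We obtain

\[
z(k) = \sum_{\sigma\in\mathfrak{S}_k}\ \sum_{1\le r\le k}\ \sum_{\substack{R\subset\{1,\dots,k\}\\ k\in R,\ |R|=r}}\ \sum_{\substack{M\in\mathcal{M}(r,k)\\ \sigma(M)=\sigma\\ \operatorname{range}(M)=R}} \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{i\in R}\frac{A_H(i)}{A_H(0)-A_H(i)}. \tag{23.14}
\]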
We recall that the permutation $\sigma$ has a descent at position $h$ if $\sigma(h) > \sigma(h+1)$.
We denote by $\operatorname{desc}(\sigma)$ the set of the positions where $\sigma$ has a descent. From the
construction of the permutation $\sigma(M)$ associated to $M$, we see that the indices of the
descents of $\sigma(M)$ are necessarily included in the range of $M$. The point is now that,
if we fix $\sigma \in \mathfrak{S}_k$ and a subset $R$ of $\{1,\dots,k\}$ containing the set $\operatorname{desc}(\sigma) \cup \{k\}$,
then there exists exactly one matrix $M$ in $\mathcal{M}(|R|,k)$ such that $\sigma(M) = \sigma$ and
$\operatorname{range}(M) = R$. Therefore the innermost summation is reduced to one element if
$\operatorname{desc}(\sigma) \subset R$, and is void otherwise. Formula (23.14) becomes


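\begin{align*}
z(k) &= \sum_{\sigma\in\mathfrak{S}_k}\ \sum_{\substack{R\subset\{1,\dots,k\}\\ k\in R,\ \operatorname{desc}(\sigma)\subset R}} \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)}\ \prod_{i\in R}\frac{A_H(i)}{A_H(0)-A_H(i)} \\
&= \frac{a^k}{k!}\ \frac{A_H(0)}{A_H(k)} \sum_{\sigma\in\mathfrak{S}_k}\ \prod_{i\in\operatorname{desc}(\sigma)\cup\{k\}}\frac{A_H(i)}{A_H(0)-A_H(i)}\ \prod_{\substack{1\le j<k\\ j\notin\operatorname{desc}(\sigma)}}\Big(1+\frac{A_H(j)}{A_H(0)-A_H(j)}\Big) \\
&= \frac{a^k}{k!} \sum_{\sigma\in\mathfrak{S}_k}\ \prod_{1\le j\le k}\frac{A_H(0)}{A_H(0)-A_H(j)}\ \prod_{i\in\operatorname{desc}(\sigma)}\frac{A_H(i)}{A_H(0)} \\
&= \frac{a^k}{k!}\ \prod_{1\le j\le k}\frac{A_H(0)}{A_H(0)-A_H(j)} \sum_{I\subset\{1,\dots,k-1\}}\ \sum_{\substack{\sigma\in\mathfrak{S}_k\\ \operatorname{desc}(\sigma)=I}}\ \prod_{i\in I}\frac{A_H(i)}{A_H(0)} \\
&= \frac{a^k}{k!}\ \prod_{1\le j\le k}\frac{A_H(0)}{A_H(0)-A_H(j)} \sum_{I\subset\{1,\dots,k-1\}} \big|\{\sigma\in\mathfrak{S}_k : \operatorname{desc}(\sigma)=I\}\big|\ \prod_{i\in I}\frac{A_H(i)}{A_H(0)} \\
&= \frac{a^k}{k!}\ \prod_{1\le j\le k}\frac{A_H(0)}{A_H(0)-A_H(j)} \sum_{I\subset\{1,\dots,k-1\}} \binom{k}{I}\ \prod_{i\in I}\frac{A_H(i)}{A_H(0)}, \tag{23.15}
\end{align*}

because the number of permutations of $\mathfrak{S}_k$ whose descents are $I$ is in fact equal to the
number of permutations of $\mathfrak{S}_k$ whose ascents are $I$, that is $\binom{k}{I}$. We have recovered
the formula (21.5) involving the up-down coefficients.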
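
Both the uniqueness claim (one matrix for each pair $(\sigma, R)$ with $\operatorname{desc}(\sigma) \cup \{k\} \subset R$) and the equality between the matrix sum (23.10) and the sum over permutations weighted by their descents can be verified exhaustively for a small $k$. The sketch below is only an illustration: the helper names and the fitness values in `A` are hypothetical, exact rational arithmetic is used, and the ones of a row are assumed to be read from left to right during the screening.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import factorial


def matrices(k):
    """Yield the matrices of M(r, k), 1 <= r <= k; a matrix is a tuple of rows and a row
    is the increasing tuple of the columns (numbered 1..k) that hold its ones."""
    for r in range(1, k + 1):
        for rows in product(range(r), repeat=k):
            if len(set(rows)) == r:  # every row holds at least one one
                yield tuple(tuple(j + 1 for j in range(k) if rows[j] == i)
                            for i in range(r))


def sigma_and_range(matrix):
    """Permutation read row by row (left to right inside a row) and range of the matrix."""
    sigma = tuple(col for row in matrix for col in row)
    rng, total = set(), 0
    for row in matrix:
        total += len(row)
        rng.add(total)
    return sigma, frozenset(rng)


def descents(sigma):
    """Positions h in 1..k-1 such that sigma(h) > sigma(h+1)."""
    return frozenset(h for h in range(1, len(sigma)) if sigma[h - 1] > sigma[h])


def z_by_matrices(k, a, A):
    """Sum over all matrices, as in formula (23.10)."""
    total = Fraction(0)
    for matrix in matrices(k):
        term = a ** k / factorial(k) * A[0] / A[k]
        for i in sigma_and_range(matrix)[1]:
            term *= A[i] / (A[0] - A[i])
        total += term
    return total


def z_by_descents(k, a, A):
    """Sum over permutations of the product of A(i)/A(0) over the descents."""
    prefactor = a ** k / factorial(k)
    for j in range(1, k + 1):
        prefactor *= A[0] / (A[0] - A[j])
    total = Fraction(0)
    for sigma in permutations(range(1, k + 1)):
        weight = Fraction(1)
        for i in descents(sigma):
            weight *= A[i] / A[0]
        total += weight
    return prefactor * total


k = 4
a = Fraction(1, 2)
A = [Fraction(v) for v in (5, 3, 2, 1, 4)]  # hypothetical values with A(0) > A(i)

pairs = [sigma_and_range(m) for m in matrices(k)]
expected = set()
for sigma in permutations(range(1, k + 1)):
    required = descents(sigma) | {k}
    optional = [h for h in range(1, k) if h not in required]
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            expected.add((sigma, required | frozenset(extra)))

print(len(pairs), len(set(pairs)), set(pairs) == expected)
print(z_by_matrices(k, a, A) == z_by_descents(k, a, A))
```

The first line printed confirms that the map $M \mapsto (\sigma(M), \operatorname{range}(M))$ is a bijection onto the admissible pairs; the second confirms that the two sums coincide for these values.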


Chapter 24
Probabilistic Interpretations


In this chapter we give two different formulas for the generalized quasispecies dis- 
tribution. The first formula expresses the generalized quasispecies distribution as 
a functional of a Poisson random walk and the second one as a functional of a 
branching Poisson walk. 


24.1 Poisson Random Walk 


We try here to interpret the generalized quasispecies distribution as a functional of 
the Poisson random walk. We proceed as in section 8.2. We start from the proba- 
bilistic representation for the solution of equation (2.1) in terms of a stopped random 
walk. More precisely, we take for $E$ the set of the Hamming classes $\{0,\dots,\ell\}$, for
$A$ the class-dependent fitness function, and for $M$ the mutation matrix given by the
difference of two independent binomial laws, as in chapter 6. To alleviate the formulas,
we drop the subscript $H$ from the notation, and we write simply $A$, $M$ instead
of $A_H$, $M_H$. Applying the formula of proposition 5.1 in this particular context, we
obtain that the equilibrium concentration of the master sequence is equal to

\[
c^*(0) = \frac{1}{E\Big(\sum_{n=1}^{\tau_0}\lambda^{-n}\prod_{k=0}^{n-1}A(Z_k)\Big)}. \tag{24.1}
\]

Here $(Z_n)_{n\in\mathbb{N}}$ is the Markov chain on $\{0,\dots,\ell\}$ with transition matrix $M$. Let us
examine what becomes of this formula in the long chain regime

\[
\ell\to\infty, \qquad q\to 0, \qquad \ell q\to a.
\]


For simplicity, we assume that the following hypothesis holds. 
Hypothesis 24.1 The fitness function $A$ is such that

\[
A(0)e^{-a} > \limsup_{k\to\infty} A(k).
\]

Under this hypothesis, we know from theorem 22.3 that a quasispecies exists. As in
the previous chapter, we assume that in the long chain regime the Perron–Frobenius
eigenvalue $\lambda$ converges towards $A(0)e^{-a}$, while the corresponding Perron–Frobenius
eigenvector converges to the quasispecies distribution. The limiting process for the
Markov chain $(Z_n)_{n\in\mathbb{N}}$ is again the Poisson random walk. Let $(Y_n)_{n\ge 1}$ be a sequence
of i.i.d. random variables with distribution the Poisson law $\mathcal{P}(a)$ of parameter $a$. We
define

\[
\forall n\ge 1 \qquad S_n = Y_1+\cdots+Y_n.
\]


If we perform a formal passage to the limit in formula (24.1), we expect that c*(0) 
converges towards 


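\[
\frac{1}{E\Big(\sum_{n=1}^{\tau_0}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big)}. \tag{24.2}
\]

The random time $\tau_0$ is now the first time of return to 0 of the Poisson random
walk. Let us rework the expectation appearing in the denominator. Since the Poisson
random walk $(S_n)_{n\in\mathbb{N}}$ has non-negative increments, either $S_1 = 0$ and $\tau_0 = 1$,
or $S_1 > 0$ and $\tau_0 = +\infty$. In the first case, which has probability $e^{-a}$, the sum reduces
to its first term $A(0)/(A(0)e^{-a}) = e^{a}$. Therefore

\[
E\Big(\sum_{n=1}^{\tau_0}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big) = 1 + E\Big(1_{\{\tau_0=+\infty\}}\sum_{n\ge 1}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big).
\]

Thus we expect that $c^*(0)$ converges towards

\[
\frac{1}{1+\sum_{n\ge 1}E\Big(1_{\{\tau_0=+\infty\}}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big)}.
\]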


We study next the concentration of the Hamming class i. We start from the repre- 
sentation formula (5.3) specialized to the class-dependent fitness landscape, which 
yields that, for $i \ge 1$,

\[
c^*(i) = c^*(0)\, E\Big(\sum_{n=0}^{\tau_0-1}\Big(1_{\{Z_n=i\}}\,\lambda^{-n}\prod_{k=0}^{n-1}A(Z_k)\Big)\Big). \tag{24.3}
\]


Performing again a formal passage to the limit, we expect that the concentration of 
the Hamming class i converges towards 
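\[
\frac{\sum_{n\ge 1}E\Big(1_{\{\tau_0=+\infty,\ S_n=i\}}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big)}{1+\sum_{n\ge 1}E\Big(1_{\{\tau_0=+\infty\}}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\Big)}.
\]

This yields a new formula for the quasispecies distribution, as a functional of the
Poisson random walk conditioned on the event that $S_1 > 0$:

\[
y(0) = \frac{1}{1+(1-e^{-a})\sum_{n\ge 1}E\Big(\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big)}, \tag{24.4}
\]

\[
\forall i\ge 1 \qquad y(i) = \frac{(1-e^{-a})\sum_{n\ge 1}E\Big(1_{\{S_n=i\}}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big)}{1+(1-e^{-a})\sum_{n\ge 1}E\Big(\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big)}. \tag{24.5}
\]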


Yet a rigorous justification of the limits taken above is rather delicate. Moreover, 
it is not obvious that this last formula coincides with one of the formulas found in 
chapter 21. In the remainder of this section, we shall check that this is indeed the 
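case. To do so, we introduce the times of the jumps of the random walk $(S_n)_{n\in\mathbb{N}}$, as
follows. We set $T_0 = 0$ and for $k \ge 1$, we define iteratively

\[
T_k = \inf\{\, n > T_{k-1} : S_n > S_{n-1} \,\}.
\]

We fix $i \ge 1$ and we define

\[
N = \inf\{\, n\ge 1 : S_{T_n} = i \,\}.
\]

With this notation, we can rewrite

\begin{align*}
\sum_{n\ge 1}E\Big(1_{\{S_n=i\}}\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big)
&= E\Big(1_{\{N<\infty\}}\sum_{n=T_N}^{T_{N+1}-1}\ \prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big) \\
&= E\Big(1_{\{N<\infty\}}\prod_{l=0}^{N-1}\Big(\frac{A(S_{T_l})}{A(0)e^{-a}}\Big)^{T_{l+1}-T_l}\ \sum_{n=0}^{T_{N+1}-T_N-1}\Big(\frac{A(i)}{A(0)e^{-a}}\Big)^{n}\,\Big|\,S_1>0\Big). \tag{24.6}
\end{align*}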


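We remark that the time intervals $T_{l+1} - T_l$, $l \ge 1$, are independent of the random
variable $N$ and the process $(S_{T_l})_{l\ge 1}$, and they are i.i.d. with distribution the geometric
law of parameter $1 - e^{-a}$. Let $k \ge 1$ and let us set $\gamma = A(k)/(A(0)e^{-a})$. Let $T$ be
a random variable with distribution the geometric law of parameter $1 - e^{-a}$. We
compute, for $0 < \gamma < e^{a}$,

\begin{align*}
E(\gamma^T) &= \sum_{n\ge 0}\gamma^{n+1}e^{-an}(1-e^{-a}) = \frac{\gamma e^{-a}}{1-\gamma e^{-a}}\,(e^{a}-1) = \frac{A(k)(e^{a}-1)}{A(0)-A(k)}, \\
E\Big(\sum_{n=0}^{T-1}\gamma^{n}\Big) &= E\Big(\frac{1-\gamma^{T}}{1-\gamma}\Big) = \frac{1-E(\gamma^{T})}{1-\gamma} = \frac{1}{1-\gamma}\Big(1-\frac{\gamma(1-e^{-a})}{1-\gamma e^{-a}}\Big) = \frac{1}{1-\gamma e^{-a}} = \frac{A(0)}{A(0)-A(k)}.
\end{align*}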
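
The two identities for the geometric variable $T$ are easy to confirm numerically. The following minimal sketch truncates the series and compares with the closed forms; the function name `geometric_moments`, the truncation level and the numerical values of $A(0)$, $A(k)$, $a$ are hypothetical choices made for illustration.

```python
import math


def geometric_moments(gamma, a, n_max=400):
    """Truncated series for E(gamma^T) and E(sum_{n=0}^{T-1} gamma^n),
    where P(T = t) = e^{-a(t-1)} (1 - e^{-a}) for t >= 1."""
    p = 1.0 - math.exp(-a)
    e_pow, e_sum, partial = 0.0, 0.0, 0.0
    for t in range(1, n_max + 1):
        prob = math.exp(-a * (t - 1)) * p   # P(T = t)
        partial += gamma ** (t - 1)         # sum_{n=0}^{t-1} gamma^n
        e_pow += prob * gamma ** t
        e_sum += prob * partial
    return e_pow, e_sum


A0, Ak, a = 3.0, 2.0, 0.4                   # hypothetical values with A(k) < A(0)
gamma = Ak / (A0 * math.exp(-a))            # here 0 < gamma < e^a
e_pow, e_sum = geometric_moments(gamma, a)
closed_pow = Ak * (math.exp(a) - 1.0) / (A0 - Ak)
closed_sum = A0 / (A0 - Ak)
print(e_pow, closed_pow)
print(e_sum, closed_sum)
```

Both series converge because their common ratio is $\gamma e^{-a} = A(k)/A(0) < 1$, which is exactly the condition $\gamma < e^{a}$.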


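Therefore we can rewrite the last expectation in (24.6) as

\begin{align*}
& e^{a}\,\frac{A(0)}{A(0)-A(i)}\ E\Big(1_{\{N<\infty\}}\prod_{l=1}^{N-1}\frac{A(S_{T_l})(e^{a}-1)}{A(0)-A(S_{T_l})}\,\Big|\,S_1>0\Big) \\
&\quad= e^{a}\,\frac{A(0)}{A(0)-A(i)} \sum_{0\le h<i}\ \sum_{1\le i_1<\cdots<i_h<i_{h+1}=i} (e^{a}-1)^{h}\ \prod_{l=1}^{h}\frac{A(i_l)}{A(0)-A(i_l)}\ P\big(S_{T_1}=i_1,\dots,S_{T_{h+1}}=i_{h+1}\,\big|\,S_1>0\big).
\end{align*}

The factor $e^{a}$ corresponds to the term $l = 0$ in formula (24.6) (recall also that $T_1 = 1$).
We notice that

\[
P\big(S_{T_1}=i_1,\dots,S_{T_{h+1}}=i_{h+1}\,\big|\,S_1>0\big) = \Big(\frac{e^{-a}}{1-e^{-a}}\Big)^{h+1}\frac{a^{i_{h+1}}}{i_1!\,(i_2-i_1)!\cdots(i_{h+1}-i_h)!}.
\]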


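From these computations, we conclude that, for $k \ge 1$, we have

\[
\sum_{n\ge 1}E\Big(1_{\{S_n=k\}}\prod_{j=0}^{n-1}\frac{A(S_j)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big)
= \frac{1}{1-e^{-a}}\ \frac{A(0)}{A(k)} \sum_{0\le h<k}\ \sum_{1\le i_1<\cdots<i_h<i_{h+1}=k} \frac{a^{k}}{i_1!\,(i_2-i_1)!\cdots(k-i_h)!}\ \prod_{l=1}^{h+1}\frac{A(i_l)}{A(0)-A(i_l)}. \tag{24.7}
\]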


This sum is the numerator of formula (24.5). The denominator is a normalization 
factor, therefore we recover the formula (21.4) for the generalized quasispecies 
distribution. The probabilistic representation formula (24.5) yields in addition a 
non-trivial probability distribution even if the hypothesis 23.2 does not hold. This 
opens the way to obtain a criterion for the existence of the generalized quasispecies 
distribution. Indeed, summing the formula (24.5) over i > 1, we get 

\[
\sum_{i\ge 1}\frac{y(i)}{y(0)} = (1-e^{-a})\sum_{n\ge 1}E\Big(\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big).
\]

Thus we have obtained the following conjectural criterion.

Conjecture 24.2 There exists a unique quasispecies associated to $w^*$ and the fitness
function $A$ if and only if

\[
\sum_{n\ge 1}E\Big(\prod_{k=0}^{n-1}\frac{A(S_k)}{A(0)e^{-a}}\,\Big|\,S_1>0\Big) < +\infty.
\]


Formula (24.5) expresses the generalized quasispecies distribution as a functional of
the Poisson random walk. To go a step further, a natural idea consists in interpreting
the factor in the expectation as a killing probability. Suppose that at time $n$, the random
walk $S_n$ is in state $i \ge 1$. We toss an independent coin of parameter $e^{a}A(i)/A(0)$ to
decide whether the walk survives another unit of time or not. More precisely, we
define, for any $i, n > 0$,

\[
P(\tau \ge n+1 \mid S_n = i,\ \tau \ge n) = e^{a}\,\frac{A(i)}{A(0)}.
\]

This defines a random time $\tau$ whose distribution is a predictable function of the
trajectory of the Poisson walk $(S_n)_{n\in\mathbb{N}}$, i.e., the event $\tau = n + 1$ depends on $n$
independent coins whose parameters are deterministic functions of the trajectory
$S_0,\dots,S_n$ until time $n$. Of course, the definition of $\tau$ makes sense only when the
factors $e^{a}A(i)/A(0)$ can be interpreted as probabilities.


Proposition 24.3 Let $A$ be a fitness function such that $A(0) > e^{a}A(k)$ for all $k \ge 1$.
Let $(S_n)_{n\in\mathbb{N}}$ be the Poisson random walk and let $\tau$ be the random time defined above.
The generalized quasispecies distribution is equal to the mean empirical distribution
of the Poisson random walk between times 1 and $\tau$, i.e.,

\[
\forall k\ge 0 \qquad y(k) = \frac{1}{E(\tau)}\,E\Big(\sum_{n=1}^{\tau}1_{\{S_n=k\}}\Big). \tag{24.8}
\]
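Proof Let $k \ge 1$ and let us set

\[
T_k = \inf\{\, n\ge 1 : S_n = k \,\}.
\]

We compute, with the help of a conditioning and the Markov property,

\begin{align*}
E\Big(\sum_{n=1}^{\tau}1_{\{S_n=k\}}\Big) &= \sum_{n\ge 1}P(\tau\ge n,\ S_n=k) \\
&= \sum_{t\ge 1}\sum_{m\ge 1}P\big(T_k=m,\ S_m=\cdots=S_{m+t-1}=k,\ \tau\ge m+t-1\big) \\
&= \sum_{t\ge 1}\sum_{m\ge 1}P(T_k=m,\ \tau\ge m)\ P\big(S_{m+1}=\cdots=S_{m+t-1}=k,\ \tau\ge m+t-1 \,\big|\, S_m=k,\ T_k=m,\ \tau\ge m\big) \\
&= \Big(\sum_{m\ge 1}P(T_k=m,\ \tau\ge m)\Big)\Big(\sum_{t\ge 1}P\big(S_1=\cdots=S_{t-1}=k,\ \tau\ge t-1 \,\big|\, S_0=k\big)\Big).
\end{align*}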
m>1 t>1 


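We deal separately with each sum. First, we have

\begin{align*}
\sum_{m\ge 1}P(T_k=m,\ \tau\ge m) &= \sum_{\substack{m\ge 1,\ 0=s_0\le\cdots\\ \le s_{m-1}<s_m=k}} P\big(S_1=s_1,\dots,S_m=s_m,\ \tau\ge m\big) \\
&= \sum_{\substack{m\ge 1,\ 0=s_0\le\cdots\\ \le s_{m-1}<s_m=k}}\ \prod_{\substack{0\le j<m\\ s_j\ge 1}}\frac{A(s_j)}{A(0)e^{-a}}\ \prod_{j=0}^{m-1}e^{-a}\frac{a^{s_{j+1}-s_j}}{(s_{j+1}-s_j)!}.
\end{align*}

We reindex the sum according to the number $h$ of distinct positive integers in the
trajectory $0 = s_0 \le \cdots \le s_{m-1} < s_m = k$ and we obtain

\begin{align*}
&\sum_{\substack{1\le h\le k,\ t_0,\dots,t_{h-1}\ge 1\\ 0<i_1<\cdots<i_{h-1}<i_h=k}} e^{-a(t_0-1)}\,e^{-a}\frac{a^{i_1}}{i_1!}\ \prod_{j=1}^{h-1}\Big(\frac{A(i_j)}{A(0)e^{-a}}\Big)^{t_j}e^{-a(t_j-1)}\,e^{-a}\frac{a^{i_{j+1}-i_j}}{(i_{j+1}-i_j)!} \\
&\quad= \sum_{\substack{1\le h\le k,\ t_0,\dots,t_{h-1}\ge 1\\ 0<i_1<\cdots<i_{h-1}<i_h=k}} e^{-at_0}\,\frac{a^{i_1}}{i_1!}\ \prod_{j=1}^{h-1}\Big(\frac{A(i_j)}{A(0)}\Big)^{t_j}\frac{a^{i_{j+1}-i_j}}{(i_{j+1}-i_j)!} \\
&\quad= \sum_{\substack{1\le h\le k\\ 0<i_1<\cdots<i_{h-1}<i_h=k}} \frac{e^{-a}}{1-e^{-a}}\ \frac{a^{k}}{i_1!}\ \prod_{j=1}^{h-1}\frac{A(i_j)}{A(0)-A(i_j)}\ \frac{1}{(i_{j+1}-i_j)!}.
\end{align*}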
O<i1 <-++<ip_ <ip =k 


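Second, we have

\[
\sum_{t\ge 1}P\big(S_1=\cdots=S_{t-1}=k,\ \tau\ge t-1 \,\big|\, S_0=k\big) = \sum_{t\ge 1}\Big(e^{-a}\,\frac{A(k)}{A(0)e^{-a}}\Big)^{t-1} = \frac{A(0)}{A(0)-A(k)}.
\]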


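Collecting together the previous computations, we obtain

\[
E\Big(\sum_{n=1}^{\tau}1_{\{S_n=k\}}\Big) = \frac{e^{-a}}{1-e^{-a}}\ \frac{A(0)}{A(k)} \sum_{\substack{1\le h\le k\\ i_0=0<i_1<\cdots<i_{h-1}<i_h=k}} a^{k}\ \prod_{j=1}^{h}\frac{A(i_j)}{A(0)-A(i_j)}\ \prod_{j=0}^{h-1}\frac{1}{(i_{j+1}-i_j)!}
\]

and, up to a multiplicative constant, we recognize the formula obtained in section 21.1
for the quantities $z(k)$. □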
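
The occupation measure of the killed walk can also be computed exactly by a forward recursion over the classes, which gives a quick numerical check of the proposition. The sketch below is illustrative only. It assumes that no coin is tossed while the walk sits at 0 and that, in a state $i \ge 1$, the walk survives one more step with probability $e^{a}A(i)/A(0)$; the function names and the fitness values are hypothetical. The ratios $w(k)/w(0)$ are compared with the values of $z(1)$ and $z(2)$ obtained by expanding formula (23.12) by hand.

```python
import math


def poisson_pmf(a, j):
    """P(Y = j) for a Poisson variable Y of parameter a."""
    return math.exp(-a) * a ** j / math.factorial(j)


def occupation_measure(A, a):
    """w[k] = sum_{n>=1} P(S_n = k, tau >= n) for the walk started at 0, classes 0..K."""
    K = len(A) - 1
    # survival probabilities; assumption of this sketch: no killing at state 0
    survive = [1.0] + [math.exp(a) * A[i] / A[0] for i in range(1, K + 1)]
    if any(s >= 1.0 for s in survive[1:]):
        raise ValueError("the sketch needs A(0) > e^a A(k) for all k >= 1")
    w = []
    for k in range(K + 1):
        inflow = poisson_pmf(a, k)  # first step, taken from the starting point 0
        inflow += sum(w[i] * survive[i] * poisson_pmf(a, k - i) for i in range(k))
        # the walk may also stay at k and survive: solve for w[k]
        w.append(inflow / (1.0 - survive[k] * poisson_pmf(a, 0)))
    return w


A = [10.0, 2.0, 1.5, 1.0, 1.0, 1.0]   # hypothetical fitness values
a = 0.5
w = occupation_measure(A, a)
ratios = [wk / w[0] for wk in w]
z1 = a * A[0] / (A[0] - A[1])
z2 = a ** 2 * A[0] * (A[0] + A[1]) / (2 * (A[0] - A[1]) * (A[0] - A[2]))
print(ratios[1], z1)
print(ratios[2], z2)
```

Normalizing `w` by its sum gives an approximation of the mean empirical distribution in (24.8), up to the truncation of the class range.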


Our probabilistic construction provides the following intuitive picture for the struc- 
ture of the quasispecies. The evolution of the genotype along a lineage is modeled 
by a Poisson random walk in the genotype space, starting from the master sequence. 
Because of the presence of the master sequence in the population, the lineages are 
bound to become extinct, after a random time which depends on their fitness history. 
A lineage is more robust if it visits genotypes whose fitnesses are close to the fitness 
of the master sequence. The time $\tau$ models the survival time of a lineage.


24.2 The Branching Poisson Walk


The probabilistic interpretation in terms of the Poisson random walk is valid only 
for fitness functions $A$ such that $A(0) > e^{a}A(k)$ for all $k \ge 1$. However, we have
seen that a generalized quasispecies distribution can be defined for fitness functions 
which violate this condition. To go around this obstacle, we shall employ a branching 
random walk in order to construct a probabilistic representation of the generalized 
quasispecies distribution. 

Our starting point is the branching process representation of section 5.2. We 
consider a multitype Galton—Watson model 


\[
Z_n = \big(Z_n(i),\ 0\le i\le \ell\big), \qquad n\ge 0,
\]

with mean matrix given by

\[
\forall i,j\in\{0,\dots,\ell\} \qquad E\big(Z_1(j) \,\big|\, Z_0 = e_i\big) = \frac{1}{\lambda}\,W(i,j).
\]

More precisely, the reproduction law of the type $i$ is taken to be the Poisson distribution
with parameter $A(i)/\lambda$. The offspring then mutate independently according to
the mutation matrix $M$. Recall that $Z_n(i)$ is the number of individuals of type $i$ present
in the $n$-th generation $Z_n$ of the process. We stop the process $(Z_n)_{n\ge 0}$ on the type 0 by
killing the descendants of individuals of type 0 in any generation $n \ge 2$. The resulting
process is denoted by $(Z^0_n)_{n\ge 0}$. Thus, in the stopped process $(Z^0_n)_{n\ge 0}$, the individuals
reproduce and mutate as in the usual Galton–Watson process $(Z_n)_{n\ge 0}$, however from
generation 1 onwards, the individuals of type 0 do not produce offspring. We denote
by $E_0$ the expectation for the process $(Z^0_n)_{n\ge 0}$ starting with a population consisting
of one individual of type 0. The normalized left Perron–Frobenius eigenvector $c^*$ of
the matrix $W$ is given by

\[
\forall i\in\{0,\dots,\ell\} \qquad c^*(i) = \frac{\sum_{n\ge 1}E_0\big(Z^0_n(i)\big)}{\sum_{n\ge 1}\ \sum_{0\le j\le \ell}E_0\big(Z^0_n(j)\big)}.
\]


Let us try to pass to the limit in this formula in the long chain regime, in the case 
where a quasispecies exists. We first have to guess the potential candidate for the 
process which is the limit of the stopped multitype Galton–Watson process $(Z^0_n)_{n\ge 0}$.
The set of types becomes infinite, and in the limit it is the set of the non-negative
integers $\mathbb{N}$. The mutation process converges to the Poisson random walk, while the
Perron–Frobenius eigenvalue $\lambda$ converges towards $A(0)e^{-a}$. The natural candidate
for the limiting process is the branching Poisson walk on N defined as follows. 


The branching Poisson walk. The state space of this process is $\mathbb{N}^{\mathbb{N}}$. We denote by
$Z_n$ the $n$-th generation,

\[
Z_n = (Z_n(0), Z_n(1), \dots) \in \mathbb{N}^{\mathbb{N}},
\]

where for $i \ge 0$, $Z_n(i)$ is the number of individuals of type $i$ present in the population.
To build $Z_{n+1}$ from $Z_n$, all the individuals present in $Z_n$ undergo a reproduction cycle.
For individuals in class $i$, the number of offspring is distributed according to a Poisson
law of parameter $e^{a}A(i)/A(0)$. Moreover each offspring is subject to mutation and
is displaced to the right according to a Poisson law of parameter $a$. Of course all the
Poisson variables involved in this process are independent from each other.

We give next a graphical construction of this process. Let

\[
\big(N_i^{n,k},\ i\ge 0,\ n\ge 0,\ k\ge 1\big), \qquad \big(Y_i^{n,k,h},\ i\ge 0,\ n\ge 0,\ k\ge 1,\ h\ge 1\big)
\]

be two collections of independent random variables, such that the variable $N_i^{n,k}$ has
Poisson distribution with parameter $e^{a}A(i)/A(0)$ and the variable $Y_i^{n,k,h}$ has Poisson
distribution with parameter $a$. The process $(Z_n)_{n\in\mathbb{N}}$ can be represented as a sum of
Dirac masses on the integers:

\[
Z_n = \sum_{i\ge 0} Z_n(i)\,\delta_i.
\]

With this representation, the process can be defined recursively by setting $Z_0 = \delta_0$
and, for $n \ge 0$,

\[
Z_{n+1} = \sum_{i\ge 0}\ \sum_{k=1}^{Z_n(i)}\ \sum_{h=1}^{N_i^{n,k}} \delta_{\,i+Y_i^{n,k,h}}.
\]


The process $(Z_n)_{n\in\mathbb{N}}$ has a specific triangular structure. Indeed, for any $i \ge 0$, only
the individuals which are at $i$ or at its left can contribute to the number of individuals
at $i$ from one generation to the next, and we have

\[
Z_{n+1}(i) = \sum_{0\le j\le i}\ \sum_{k=1}^{Z_n(j)}\ \sum_{h=1}^{N_j^{n,k}} 1_{\{\,j+Y_j^{n,k,h}=i\,\}}. \tag{24.9}
\]

Therefore, for any $i \ge 0$, the process of the first $i + 1$ marginals

\[
\big(Z_n(0),\dots,Z_n(i)\big)_{n\in\mathbb{N}}
\]

is a Markov chain with state space $\mathbb{N}^{i+1}$.


Stopping the master sequence. We proceed next as in the finite case, i.e., we
stop the process $(Z_n)_{n\ge 0}$ on the type 0 by killing the descendants of individuals
of type 0 in any generation $n \ge 2$. The resulting process is denoted by $(Z^0_n)_{n\ge 0}$.
Thus, in the stopped process $(Z^0_n)_{n\ge 0}$, the individuals reproduce and mutate as in
the usual Galton–Watson process $(Z_n)_{n\ge 0}$, however from generation 1 onwards, the
individuals of type 0 do not produce offspring. We denote by $E_0$ the expectation
for the process $(Z^0_n)_{n\ge 0}$ starting with a population consisting of one individual of
type 0. If we perform a formal passage to the limit in the finite formulas, then we are
led to the following natural conjectures. We expect that the generalized quasispecies
distribution exists if and only if the total expected number of descendants of the
master sequence is finite, i.e.,

\[
\sum_{n\ge 1}\ \sum_{k\ge 0} E_0\big(Z^0_n(k)\big) < +\infty.
\]

In this case, we expect that the generalized quasispecies distribution is given by the
following formula:

\[
\forall k\ge 0 \qquad y(k) = \frac{\sum_{n\ge 1}E_0\big(Z^0_n(k)\big)}{\sum_{n\ge 1}\ \sum_{j\ge 0}E_0\big(Z^0_n(j)\big)}. \tag{24.10}
\]


However, we will not be exploring this probabilistic representation further in this 
text, nor will we try to give a rigorous proof of this natural conjecture. 
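
Although the conjecture is not proved here, the expected numbers of descendants in formula (24.10) are easy to compute, because the mean measure of the stopped branching Poisson walk obeys a triangular recursion. The sketch below is a hedged illustration: the function name `expected_descendants`, the truncation level `K` and the sharp peak landscape ($A(0) = \sigma$, $A(k) = 1$ for $k \ge 1$) are assumptions of the example. The reference value $(\sigma e^{-a} - 1)/(\sigma - 1)$ for $y(0)$ is obtained from the generating function of this recursion in that particular landscape.

```python
import math


def expected_descendants(A, a):
    """u[k] = sum_{n>=1} E_0(Z^0_n(k)) for the stopped branching Poisson walk, k = 0..K."""
    K = len(A) - 1

    def mean(i, k):
        # mean number of children in class k of one parent in class i:
        # (e^a A(i)/A(0)) * e^{-a} a^(k-i)/(k-i)!
        return A[i] / A[0] * a ** (k - i) / math.factorial(k - i)

    u = [mean(0, 0)]  # class 0 only receives children of the root
    for k in range(1, K + 1):
        # children of the root, plus children of the descendants in classes 1..k-1
        inflow = mean(0, k) + sum(u[i] * mean(i, k) for i in range(1, k))
        u.append(inflow / (1.0 - mean(k, k)))  # requires A(k) < A(0)
    return u


sigma, a, K = 5.0, 1.0, 80          # hypothetical sharp peak landscape, truncated at K
A = [sigma] + [1.0] * K
u = expected_descendants(A, a)
total = sum(u)
y = [uk / total for uk in u]
print(y[0], (sigma * math.exp(-a) - 1.0) / (sigma - 1.0))
```

Note that the recursion only requires $A(k) < A(0)$, not $A(0) > e^{a}A(k)$, which is the reason for introducing the branching walk.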


Chapter 25
Infinite Population Models


As for the quasispecies distribution in chapter 10, we explore here the relationship 
between the infinite population models introduced in chapters 3 and 4 and the limit 
equation (21.1), as well as its (potentially many) solutions. 


25.1 The Moran—Kingman Model 


We consider the Moran–Kingman model introduced in section 3.1, where $E$ is taken
to be the set of the Hamming classes $\{0,\dots,\ell\}$, along with the fitness function $A_H$
and the lumped mutation matrix $M_H$. To alleviate the formulas, we drop the subscript
$H$ from the notation, and we write simply $A$, $M$ instead of $A_H$, $M_H$. As in the previous
chapter, we suppose that the fitness function $A$ satisfies the hypothesis 21.1. Recall
that we have defined the set $S^\infty$ as

\[
S^\infty = \Big\{\, c\in[0,1]^{\mathbb{N}} : \sum_{i\in\mathbb{N}} c(i) \le 1 \,\Big\}.
\]

A sequence $c \in S^\infty$ represents the proportions of each type in a population where
the number of possible types is infinite. Moreover, the sum of the proportions may
be smaller than 1, which should be interpreted as if some of the mass has “escaped
to infinity”. For such a sequence $c$, we denote by $\phi(c)$ its mean fitness:

\[
\phi(c) = \sum_{i\ge 0} c(i)A(i). \tag{25.1}
\]


Passing to the limit on the mapping $\Phi$ defined by (3.2), we obtain a mapping $\Psi_\infty$
from $S^\infty$ to $S^\infty$, which is given by

\[
\forall k\ge 0 \quad \forall c\in S^\infty \qquad \Psi_\infty(c)(k) = \phi(c)^{-1}\sum_{0\le i\le k} c(i)A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}.
\]

We can therefore define the limiting Moran–Kingman model as the dynamical system
on $S^\infty$ associated to the iteration of the map $\Psi_\infty$:

\[
\forall n\ge 0 \qquad c_{n+1} = \Psi_\infty(c_n) = \Psi_\infty^{n+1}(c_0). \tag{25.2}
\]

The equations for the fixed points of the limiting dynamical system coincide with
the limit equations (21.1). We have shown in section 22.3 that the system (21.1) has
several solutions that satisfy the constraint (21.2), and these solutions are indexed
by the set $I_A$ defined in formula (22.1). We denote by $i_1,\dots,i_{|I_A|}$ the elements of
$I_A$, arranged in ascending order. For $1 \le h \le |I_A|$, we denote by $y^{i_h}$ the solution
associated to the index $i_h$, as defined in theorem 22.7. The asymptotic behavior of
the sequence of the iterates $(c_n)_{n\ge 0}$ is given by the next theorem.


Theorem 25.1 Let $c \in S^\infty$ be the initial condition of the dynamical system (25.2).
We have the following convergence: for $1 \le h \le |I_A|$,

\[
c(0) = \cdots = c(i_{h-1}) = 0 < \max_{i_{h-1}<i\le i_h} c(i) \quad\Longrightarrow\quad \lim_{n\to\infty}\Psi_\infty^{n}(c) = y^{i_h}.
\]

For any other initial condition, the iterates of $\Psi_\infty$ converge to the trivial fixed point 0.
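
The convergence stated in the theorem can be observed numerically by iterating the limiting map on finitely many classes. This is only a sketch under stated assumptions: the function name `psi_infinity`, the truncation at class `K`, the number of iterations and the sharp peak landscape ($A(0) = \sigma$, $A(k) = 1$ otherwise) are choices made for the example. For this landscape the fixed point equation $\phi(c) = A(0)e^{-a}$, together with a total mass equal to 1, gives $c(0) = (\sigma e^{-a} - 1)/(\sigma - 1)$, which serves as the reference value.

```python
import math


def psi_infinity(c, A, a):
    """One step of the limiting Moran-Kingman map, restricted to the classes 0..K."""
    K = len(c) - 1
    phi = sum(ci * Ai for ci, Ai in zip(c, A))  # mean fitness (25.1), truncated
    kernel = [math.exp(-a) * a ** j / math.factorial(j) for j in range(K + 1)]
    return [sum(c[i] * A[i] * kernel[k - i] for i in range(k + 1)) / phi
            for k in range(K + 1)]


sigma, a, K = 5.0, 1.0, 60           # hypothetical landscape with sigma * e^{-a} > 1
A = [sigma] + [1.0] * K
c = [1.0] + [0.0] * K                # initial condition: only the master class
for _ in range(200):
    c = psi_infinity(c, A, a)

target = (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)
residual = max(abs(x - y) for x, y in zip(psi_infinity(c, A, a), c))
print(c[0], target, residual)
```

Starting instead from an initial condition with $c(0) = 0$ makes the iterates leave the truncated window, which is the numerical counterpart of the mass escaping to infinity.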


25.2 The Eigen Model 


Finally, we consider the Eigen model introduced in section 4.1, where $E$ is taken to
be the set of the Hamming classes $\{0,\dots,\ell\}$, along with the fitness function $A$ and
the lumped mutation matrix $M$. Passing to the limit on the differential equation (4.2),
we obtain the infinite system of differential equations

\[
\forall k\ge 0 \qquad \frac{dc_t(k)}{dt} = \sum_{0\le i\le k} c_t(i)A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!} - c_t(k)\,\phi(c_t), \tag{25.3}
\]

where $\phi(c_t)$ is the mean fitness of $c_t$, defined as in formula (25.1). We can thus
define a limiting Eigen model by this system of differential equations. The equations
describing the equilibrium solutions of the limiting Eigen model coincide again with
the limit system of equations (21.1). The asymptotic behavior of the solution of the
differential system (25.3) as $t$ goes to $\infty$ is described in the next theorem. We use the
same notations $I_A$, $i_h$, $y^{i_h}$ as in the previous section.


Theorem 25.2 Let $c \in S^\infty$, and let $(c_t)_{t\ge 0}$ be the solution of the infinite system (25.3)
with initial condition $c_0 = c$. We have the following convergence: for $1 \le h \le |I_A|$,

\[
c(0) = \cdots = c(i_{h-1}) = 0 < \max_{i_{h-1}<i\le i_h} c(i) \quad\Longrightarrow\quad \lim_{t\to\infty} c_t = y^{i_h}.
\]

For any other initial condition, the trajectory $c_t$ converges to 0.


Part VI 
A Glimpse at the Dynamics 


Overview of Part VI 


So far, we have studied in quite some detail the models introduced in chapters 3 and 4. 
Up until now, we have focused mainly on the equilibrium aspect of these models, or 
on the convergence of the processes to their equilibrium state. Nevertheless, there 
is more to the relationship between the different models than that happening at the 
equilibrium level. The objective of this last part is to complete the picture by showing 
some of the convergences that take place for the trajectories of the different models. 


Chapter 26
Deterministic Level


In this chapter we show the relationship between the Moran—Kingman model and 
Eigen’s model with their infinite counterparts. 


26.1 The Moran—Kingman Model 


We consider the Moran–Kingman model introduced in section 3.1, where $E$ is taken
to be the set of the Hamming classes $\{0,\dots,\ell\}$, along with the fitness function
$A_H$ and the lumped mutation matrix $M_H$. We suppose that the fitness function
$A_H$ satisfies the hypothesis 21.1. The Moran–Kingman model is the discrete-time
dynamical system on $S^{\ell+1}$ given by the iterates of the mapping $\Phi$ defined by (3.2):

\[
\forall k\in\{0,\dots,\ell\} \quad \forall c\in S^{\ell+1} \qquad \Phi(c)(k) = \phi(c)^{-1}\sum_{0\le i\le \ell} c(i)A_H(i)M_H(i,k),
\]

where $\phi(c) = \sum_{0\le i\le \ell} c(i)A_H(i)$ is the mean fitness of a population whose genotypic
frequencies are given by the vector $c$. When $\ell$ goes to $\infty$, $q$ to 0 and $\ell q$ to $a$, we
obtain a limit dynamical system on $S^\infty$ given by the iterates of the mapping $\Psi_\infty$,
which is defined as

\[
\forall k\ge 0 \quad \forall y\in S^\infty \qquad \Psi_\infty(y)(k) = \phi(y)^{-1}\sum_{0\le i\le k} y(i)A_H(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!},
\]

where

\[
\forall y\in S^\infty \qquad \phi(y) = \sum_{i\ge 0} y(i)A_H(i).
\]
In the previous chapters, we have shown several convergence results concerning 
these two systems. We summarize these results below. As in the previous chapter, we 
omit the index H from the fitness function and the mutation matrix, and we denote 
them simply by $A$ and $M$. For the sake of simplicity, we assume in the sequel that
the fitness function $A$ converges to 1.


Hypothesis 26.1 We suppose that $A(0) > A(k)$ for all $k \ge 1$ and that

\[
\lim_{k\to\infty} A(k) = 1.
\]

Let $c \in S^{\ell+1}$ and $y \in S^\infty$ be such that

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sum_{k\ge 0}|c(k) - y(k)| = 0. \tag{26.1}
\]
k>0 


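• Assume that $A(0)e^{-a} < 1$, then the following limits hold:

[Diagram: in the top row, $\Phi^n(c)$ tends to its limit as $n \to \infty$; in the bottom row, $\Psi_\infty^n(y) \to 0$ as $n \to \infty$; the vertical arrows, from the top row to the bottom row, are the long chain limits $\ell \to \infty$, $q \to 0$, $\ell q \to a$.]

• Assume that $A(0)e^{-a} > 1$ and $y(0) > 0$, then the following limits hold:

[Diagram: in the top row, $\Phi^n(c)$ tends to its limit as $n \to \infty$; in the bottom row, $\Psi_\infty^n(y) \to y^0$ as $n \to \infty$; the vertical arrows, from the top row to the bottom row, are the long chain limits $\ell \to \infty$, $q \to 0$, $\ell q \to a$.]

Here $y^0$ is the quasispecies distribution introduced in the statement of theorem 22.7.
The limits represented by solid lines have been proven to hold in the previous
sections; the purpose of this section is to show that the limits represented by the
dashed lines hold too.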
Theorem 26.2 Assume that $c \in S^{\ell+1}$ and $y \in S^\infty$ are such that (26.1) holds. For
every $T \ge 0$ and for every $k \ge 0$, we have

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sup_{0\le n\le T}\big|\Phi^n(c)(k) - \Psi_\infty^n(y)(k)\big| = 0.
\]

In order to prove this theorem, we need a bound on the probability of a backward
mutation, which we give in the following lemma.


Lemma 26.3 Let $k \ge 0$ be fixed. The probability of a backward mutation towards
the class $k$ goes to 0, i.e.,

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sum_{i=k+1}^{\ell} M(i,k) = 0.
\]
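Proof We have

\[
\sum_{i=k+1}^{\ell} M(i,k) = \sum_{i=k+1}^{\ell} P\big(i - X^i + Y^i = k\big),
\]

where $X^i \sim \mathrm{Bin}(i, q/(\kappa-1))$ and $Y^i \sim \mathrm{Bin}(\ell-i, q)$ are independent random variables.
Decomposing each term according to the value of $Y^i$, we obtain

\begin{align*}
\sum_{i=k+1}^{\ell} M(i,k) &= \sum_{i=k+1}^{\ell}\ \sum_{0\le h\le k} P\big(Y^i = h,\ X^i = i-k+h\big) \\
&\le \sum_{i=k+1}^{\ell}\ \sum_{0\le h\le k} P\big(X^i = i-k+h\big) \\
&= \sum_{i=k+1}^{\ell}\ \sum_{0\le h\le k} \binom{i}{i-k+h}\Big(\frac{q}{\kappa-1}\Big)^{i-k+h}\Big(1-\frac{q}{\kappa-1}\Big)^{k-h} \\
&\le \sum_{i\ge k+1}\ \sum_{0\le h\le k} \binom{i}{h}\,q^{i-h} = \sum_{0\le h\le k}\frac{1}{h!}\sum_{i\ge k+1}\frac{i!}{(i-h)!}\,q^{i-h}.
\end{align*}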

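For $k \ge 0$, define

\[
f_k(q) = \sum_{i\ge k+1} q^{i} = \frac{q^{k+1}}{1-q}.
\]

Differentiating the function $f_k(q)$ successively $h$ times, we see that

\[
f_k^{(h)}(q) = \sum_{i\ge k+1} i(i-1)\cdots(i-h+1)\,q^{i-h} = \sum_{i\ge k+1}\frac{i!}{(i-h)!}\,q^{i-h}.
\]

Thus

\[
\sum_{i=k+1}^{\ell} M(i,k) \le \sum_{0\le h\le k}\frac{1}{h!}\,f_k^{(h)}(q).
\]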
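Yet, we can also compute $f_k^{(h)}$ as follows:

\begin{align*}
f_k^{(h)}(q) &= \frac{d^h}{dq^h}\Big(q^{k+1}\,\frac{1}{1-q}\Big) = \sum_{0\le t\le h}\binom{h}{t}\Big(\frac{d^t}{dq^t}\,q^{k+1}\Big)\Big(\frac{d^{h-t}}{dq^{h-t}}\,(1-q)^{-1}\Big) \\
&= \sum_{0\le t\le h}\binom{h}{t}\frac{(k+1)!}{(k+1-t)!}\,q^{k+1-t}\,(h-t)!\,\Big(\frac{1}{1-q}\Big)^{h-t+1} \\
&= \sum_{0\le t\le h}\frac{h!\,(k+1)!}{t!\,(k+1-t)!}\,q^{k+1-t}\Big(\frac{1}{1-q}\Big)^{h-t+1}.
\end{align*}

Thus,

\[
\sum_{i=k+1}^{\ell} M(i,k) \le \sum_{0\le h\le k}\ \sum_{0\le t\le h}\binom{k+1}{t}\,q^{k+1-t}\Big(\frac{1}{1-q}\Big)^{h-t+1},
\]

which goes to 0 when $q$ goes to 0. □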
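
The lemma can be illustrated numerically by evaluating the backward mutation mass directly from the two binomial laws for increasing chain lengths. The following sketch is only indicative: the function names, the alphabet size `kappa = 4`, the class `k = 1` and the choice $q = a/\ell$ with $a = 1$ are hypothetical.

```python
from math import comb


def binom_pmf(n, p, j):
    """P(B = j) for B ~ Bin(n, p), with value 0 outside 0..n."""
    if j < 0 or j > n:
        return 0.0
    return comb(n, j) * p ** j * (1.0 - p) ** (n - j)


def backward_mass(ell, q, kappa, k):
    """sum_{i=k+1}^{ell} M(i, k), where M(i, k) = P(i - X + Y = k) with
    X ~ Bin(i, q/(kappa-1)) and Y ~ Bin(ell - i, q) independent."""
    total = 0.0
    for i in range(k + 1, ell + 1):
        for h in range(k + 1):  # h is the value of Y, hence X = i - k + h
            total += binom_pmf(ell - i, q, h) * binom_pmf(i, q / (kappa - 1), i - k + h)
    return total


a, kappa, k = 1.0, 4, 1
values = [backward_mass(ell, a / ell, kappa, k) for ell in (10, 100, 1000)]
print(values)
```

The printed values decrease roughly like $1/\ell$, in agreement with the bound of the proof, whose leading term is of order $q$.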


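Proof of theorem 26.2. Recall that we have assumed in hypothesis 26.1 that the
fitness function converges to 1. Let $k_0 \ge 0$, $\delta > 0$, and choose $k^* \ge k_0$ such that for
every $k > k^*$ we have $|A(k) - 1| < \delta$. Let further $\ell$ be large enough, $q$ small enough
and $\ell q$ close enough to $a$, so that

\begin{align*}
\sup_{0\le i\le k^*}\ \sum_{i\le k\le k^*}\Big|M(i,k) - e^{-a}\frac{a^{k-i}}{(k-i)!}\Big| &< \delta, \tag{26.2} \\
\sum_{0\le k\le k^*}|c(k) - y(k)| &< \delta, \tag{26.3} \\
\sum_{0\le k\le k^*}\ \sum_{k<i\le \ell} M(i,k) &< \delta. \tag{26.4}
\end{align*}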

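Let $c' \in S^{\ell+1}$ and $y' \in S^\infty$ and let us try to control $|\Phi(c')(k) - \Psi_\infty(y')(k)|$. For
$k \ge 0$, adding and subtracting $\phi(c')^{-1}\phi(y')\Psi_\infty(y')(k)$, we obtain

\begin{align*}
|\Phi(c')(k) - \Psi_\infty(y')(k)| &\le \frac{1}{\phi(c')}\sum_{0\le i\le k}A(i)\,\Big|c'(i)M(i,k) - y'(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}\Big| \\
&\quad+ \frac{1}{\phi(c')}\sum_{k<i\le \ell}c'(i)A(i)M(i,k) + \Big|\frac{1}{\phi(c')} - \frac{1}{\phi(y')}\Big|\sum_{0\le i\le k}y'(i)A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!} \\
&\le \frac{1}{\phi(c')}\sum_{0\le i\le k}A(i)c'(i)\,\Big|M(i,k) - e^{-a}\frac{a^{k-i}}{(k-i)!}\Big| \\
&\quad+ \frac{1}{\phi(c')}\sum_{0\le i\le k}A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}\,|c'(i) - y'(i)| + \frac{1}{\phi(c')}\sum_{k<i\le \ell}c'(i)A(i)M(i,k) \\
&\quad+ \frac{|\phi(c') - \phi(y')|}{\phi(c')\,\phi(y')}\sum_{0\le i\le k}y'(i)A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}.
\end{align*}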
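Summing from $k = 0$ to $k^*$ and exchanging the summation order for all but the third
term, we obtain further

\begin{align*}
\sum_{0\le k\le k^*}|\Phi(c')(k) - \Psi_\infty(y')(k)| &\le \frac{1}{\phi(c')}\sum_{0\le i\le k^*}A(i)c'(i)\sum_{i\le k\le k^*}\Big|M(i,k) - e^{-a}\frac{a^{k-i}}{(k-i)!}\Big| \\
&\quad+ \frac{1}{\phi(c')}\sum_{0\le i\le k^*}A(i)\,|c'(i) - y'(i)|\sum_{i\le k\le k^*}e^{-a}\frac{a^{k-i}}{(k-i)!} \\
&\quad+ \frac{1}{\phi(c')}\sum_{0\le k\le k^*}\ \sum_{k+1\le i\le \ell}c'(i)A(i)M(i,k) \\
&\quad+ \frac{|\phi(c') - \phi(y')|}{\phi(c')\,\phi(y')}\sum_{0\le i\le k^*}y'(i)A(i)\sum_{i\le k\le k^*}e^{-a}\frac{a^{k-i}}{(k-i)!}.
\end{align*}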

Let $A_{\min}$ denote the minimum of the fitness values, which is assumed to be
strictly positive. By (26.2) and (26.4), the first and the third terms above are
bounded respectively by $\delta$ and $A(0)\delta/A_{\min}$. The second term is bounded by
$(A(0)/A_{\min})\sum_{0\le i\le k^*}|c'(i) - y'(i)|$. The last sum involved in the fourth term is smaller
than 1, so that the fourth term is itself bounded by $|\phi(c') - \phi(y')|/A_{\min}$. Therefore,

\[
\sum_{0\le k\le k^*}|\Phi(c')(k) - \Psi_\infty(y')(k)| \le \frac{2A(0)\delta}{A_{\min}} + \frac{A(0)}{A_{\min}}\sum_{0\le i\le k^*}|c'(i) - y'(i)| + \frac{1}{A_{\min}}\,|\phi(c') - \phi(y')|. \tag{26.5}
\]


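It remains to bound the last term. For $x \in S^\infty$, we introduce the following fitness
function:

\[
\phi^*(x) = \sum_{0\le i\le k^*}A(i)x(i) + \sum_{i>k^*}x(i) = \sum_{0\le i\le k^*}A(i)x(i) + 1 - \sum_{0\le i\le k^*}x(i).
\]

Thanks to the choice of $k^*$, we have $|\phi(x) - \phi^*(x)| < \delta$. Therefore,

\[
|\phi(c') - \phi(y')| \le 2\delta + |\phi^*(c') - \phi^*(y')| \le 2\delta + A(0)\sum_{0\le i\le k^*}|c'(i) - y'(i)|.
\]

Coming back to (26.5), we get

\[
\sum_{0\le k\le k^*}|\Phi(c')(k) - \Psi_\infty(y')(k)| \le \frac{4A(0)\delta}{A_{\min}} + \frac{2A(0)}{A_{\min}}\sum_{0\le i\le k^*}|c'(i) - y'(i)|.
\]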


Iterating this inequality, we conclude that there exists a constant $C > 1$ depending
only on $A(0)$, $A_{\min}$ such that

\[
\forall n\ge 1 \qquad \sum_{0\le k\le k^*}\big|\Phi^n(c')(k) - \Psi_\infty^n(y')(k)\big| \le \delta\,\frac{C^{n+1}}{C-1} + C^{n}\sum_{0\le i\le k^*}|c'(i) - y'(i)|.
\]

Now taking for starting points $c$, $y$ satisfying the inequality (26.3), and recalling that
$k^* \ge k_0$, we see that, for any $T \ge 1$,

\[
\forall n\in\{1,\dots,T\} \qquad \big|\Phi^n(c)(k_0) - \Psi_\infty^n(y)(k_0)\big| \le \delta\,\frac{C^{T+1}}{C-1} + \delta\,C^{T}.
\]

This implies the statement of the theorem. □
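
The statement of the theorem can be observed on a small numerical experiment, by iterating the finite map $\Phi$ and the limiting map $\Psi_\infty$ from the same initial condition and comparing the first few classes. The sketch below is not part of the proof: the function names, the sharp peak landscape, the binary alphabet and the values $\ell = 30$ and $\ell = 120$ are hypothetical choices, and the limiting system is truncated at class $\ell$, which is harmless here because almost no mass travels that far in three steps.

```python
import math
from math import comb


def lumped_mutation_matrix(ell, q, kappa):
    """M(i, k) = P(i - X + Y = k), X ~ Bin(i, q/(kappa-1)), Y ~ Bin(ell-i, q)."""
    def pmf(n, p):
        return [comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(n + 1)]
    M = [[0.0] * (ell + 1) for _ in range(ell + 1)]
    for i in range(ell + 1):
        back, forth = pmf(i, q / (kappa - 1)), pmf(ell - i, q)
        for x, px in enumerate(back):
            for y, py in enumerate(forth):
                M[i][i - x + y] += px * py
    return M


def finite_step(c, A, M):
    """One iteration of the finite Moran-Kingman map."""
    n = len(c)
    phi = sum(c[i] * A[i] for i in range(n))
    return [sum(c[i] * A[i] * M[i][k] for i in range(n)) / phi for k in range(n)]


def limit_step(y, A, a):
    """One iteration of the limiting map, truncated to the classes 0..len(y)-1."""
    n = len(y)
    phi = sum(y[i] * A[i] for i in range(n))
    return [sum(y[i] * A[i] * math.exp(-a) * a ** (k - i) / math.factorial(k - i)
                for i in range(k + 1)) / phi for k in range(n)]


def max_gap(ell, a, kappa, fitness, steps, k_max):
    """Largest |Phi^n(c)(k) - Psi^n(y)(k)| over 1 <= n <= steps and 0 <= k <= k_max."""
    A = [fitness(i) for i in range(ell + 1)]
    M = lumped_mutation_matrix(ell, a / ell, kappa)
    c = [1.0] + [0.0] * ell
    y = [1.0] + [0.0] * ell
    gap = 0.0
    for _ in range(steps):
        c, y = finite_step(c, A, M), limit_step(y, A, a)
        gap = max(gap, max(abs(c[k] - y[k]) for k in range(k_max + 1)))
    return gap


def sharp_peak(i):
    return 3.0 if i == 0 else 1.0


gaps = {ell: max_gap(ell, a=0.5, kappa=2, fitness=sharp_peak, steps=3, k_max=3)
        for ell in (30, 120)}
print(gaps)
```

In this experiment the gap shrinks when the chain length is multiplied by four, consistent with an error of order $1/\ell$ over a fixed time horizon.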


26.2 The Eigen Model 


We consider the Eigen model introduced in section 4.1, where $E$ is taken to be the
set of the Hamming classes $\{0,\dots,\ell\}$, along with the fitness function $A_H$ and the
lumped mutation matrix $M_H$. We suppose that the fitness function $A_H$ satisfies the
hypothesis 26.1. The Eigen model is defined by the system of differential equations

\[
\forall k\ge 0 \qquad \frac{dc_t(k)}{dt} = \sum_{0\le i\le \ell} c_t(i)A_H(i)M_H(i,k) - c_t(k)\,\phi(c_t),
\]

where

\[
\phi(c_t) = \sum_{0\le i\le \ell} c_t(i)A_H(i)
\]

is the mean fitness of $c_t$. The solutions of this system take their values in the set
$S^{\ell+1}$. Passing to the limit, we obtain the infinite system of differential equations

\[
\forall k\ge 0 \qquad \frac{dy_t(k)}{dt} = \sum_{0\le i\le k} y_t(i)A_H(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!} - y_t(k)\,\phi(y_t),
\]

with

\[
\forall c\in S^\infty \qquad \phi(c) = \sum_{i\ge 0} c(i)A_H(i).
\]

The solutions now take their values in $S^\infty$. In the previous chapters, we have shown
several convergence results concerning these two systems. We summarize these
results below. As for the Moran–Kingman model, we omit the index $H$ from the
fitness function and the mutation matrix, and we denote them simply by $A$ and $M$.
Let $c \in S^{\ell+1}$ and $y \in S^\infty$ be such that

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sum_{k\ge 0}|c(k) - y(k)| = 0.
\]

We denote by $(c_t)_{t\ge 0}$ and $(y_t)_{t\ge 0}$ the solutions of the above systems with initial
conditions $c$ and $y$ respectively. We have the following diagrams, depending on the
value of $A(0)e^{-a}$.
• Assume that $A(0)e^{-a} < 1$, then the following limits hold:

[Diagram: limits of $c_t$ and $y_t$ as $t \to \infty$ and in the long chain regime.]

• Assume that $A(0)e^{-a} > 1$ and $y(0) > 0$, then the following limits hold:

[Diagram: limits of $c_t$ and $y_t$ as $t \to \infty$ and in the long chain regime.]

Here $y^0$ is the quasispecies distribution introduced in the statement of theorem 22.7.
The limits represented by solid lines have been proven in the previous sections; we
prove next that the limits represented by the dashed lines hold too.


Theorem 26.4 For every $T \ge 0$ and for every $k \ge 0$, we have

\[
\lim_{\substack{\ell\to\infty,\ q\to 0\\ \ell q\to a}}\ \sup_{0\le t\le T}|c_t(k) - y_t(k)| = 0.
\]

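Proof We have, for $k \ge 0$,

\begin{align*}
c_t(k) &= c_0(k) + \int_0^t\Big(\sum_{0\le i\le \ell}c_s(i)A(i)M(i,k) - c_s(k)\,\phi(c_s)\Big)\,ds, \\
y_t(k) &= y_0(k) + \int_0^t\Big(\sum_{0\le i\le k}y_s(i)A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!} - y_s(k)\,\phi(y_s)\Big)\,ds.
\end{align*}

Let $k_0 \ge 0$, $\delta > 0$ and choose $k^* \ge k_0$ such that for $k > k^*$ we have $|A(k) - 1| < \delta$.
Let further $\ell$ be large enough, $q$ small enough and $\ell q$ close enough to $a$, so that

\begin{align*}
\sum_{k\ge 0}|c_0(k) - y_0(k)| &< \delta, \tag{26.6} \\
\sup_{0\le i\le k^*}\ \sum_{i\le k\le k^*}\Big|M(i,k) - e^{-a}\frac{a^{k-i}}{(k-i)!}\Big| &< \delta, \tag{26.7} \\
\sum_{0\le k\le k^*}\ \sum_{k<i\le \ell}M(i,k) &< \delta. \tag{26.8}
\end{align*}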


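We have then

\begin{align*}
|c_t(k) - y_t(k)| &\le |c_0(k) - y_0(k)| + \sum_{i=0}^{k}A(i)\int_0^t\Big|c_s(i)M(i,k) - y_s(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}\Big|\,ds \\
&\quad+ \sum_{i=k+1}^{\ell}A(i)M(i,k)\int_0^t c_s(i)\,ds + \int_0^t\big|c_s(k)\phi(c_s) - y_s(k)\phi(y_s)\big|\,ds \\
&\le |c_0(k) - y_0(k)| + \sum_{i=0}^{k}A(i)\,\Big|M(i,k) - e^{-a}\frac{a^{k-i}}{(k-i)!}\Big|\int_0^t c_s(i)\,ds \\
&\quad+ \sum_{i=0}^{k}A(i)\,e^{-a}\frac{a^{k-i}}{(k-i)!}\int_0^t|c_s(i) - y_s(i)|\,ds + \sum_{i=k+1}^{\ell}A(i)M(i,k)\int_0^t c_s(i)\,ds \\
&\quad+ \int_0^t\phi(y_s)\,|c_s(k) - y_s(k)|\,ds + \int_0^t c_s(k)\,|\phi(c_s) - \phi(y_s)|\,ds.
\end{align*}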


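Summing from $k = 0$ to $k^*$ and exchanging the summation order in the second and
third terms, we obtain:

\[
\begin{aligned}
\sum_{0 \le k \le k^*} |c_t(k) - y_t(k)| \le{}& \sum_{0 \le k \le k^*} |c_0(k) - y_0(k)| \\
&+ \sum_{0 \le i \le k^*} A(i) \sum_{i \le k \le k^*} \Big| M(i,k) - e^{-a} \frac{a^{k-i}}{(k-i)!} \Big| \int_0^t c_s(i)\, ds \\
&+ \sum_{0 \le i \le k^*} A(i) \sum_{i \le k \le k^*} e^{-a} \frac{a^{k-i}}{(k-i)!} \int_0^t |c_s(i) - y_s(i)|\, ds \\
&+ \sum_{0 \le k \le k^*} \ \sum_{i=k+1}^{\ell} A(i) M(i,k) \int_0^t c_s(i)\, ds \\
&+ \int_0^t \phi(y_s) \sum_{0 \le k \le k^*} |c_s(k) - y_s(k)|\, ds \\
&+ \int_0^t \sum_{0 \le k \le k^*} c_s(k)\, |\phi(c_s) - \phi(y_s)|\, ds.
\end{aligned}
\]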
By (26.6) the first term is bounded by $\delta$. Likewise, by (26.7) the second term is
bounded by $A(0)\delta t$. The sum over $k$ in the third term is bounded by 1, so that the
third term, as well as the fifth one, are both bounded by

\[
A(0) \int_0^t \sum_{0 \le i \le k^*} |c_s(i) - y_s(i)|\, ds.
\]

By (26.8), the fourth term is bounded by $A(0)\delta t$. Therefore

\[
\begin{aligned}
\sum_{0 \le k \le k^*} |c_t(k) - y_t(k)| \le{}& \delta + 2A(0)\delta t \\
&+ 2A(0) \int_0^t \sum_{0 \le i \le k^*} |c_s(i) - y_s(i)|\, ds + \int_0^t |\phi(c_s) - \phi(y_s)|\, ds.
\end{aligned}
\]
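For $x \in S^\infty$, we introduce the following fitness function:

\[
\phi^*(x) \,=\, \sum_{0 \le i \le k^*} A(i)\, x(i) + \sum_{i > k^*} x(i) \,=\, \sum_{0 \le i \le k^*} A(i)\, x(i) + 1 - \sum_{0 \le i \le k^*} x(i).
\]

Thanks to the choice of $k^*$, we have $|\phi(x) - \phi^*(x)| < \delta$. Therefore,

\[
\begin{aligned}
\int_0^t |\phi(c_s) - \phi(y_s)|\, ds &\le 2\delta t + \int_0^t |\phi^*(c_s) - \phi^*(y_s)|\, ds \\
&\le 2\delta t + A(0) \int_0^t \sum_{0 \le k \le k^*} |c_s(k) - y_s(k)|\, ds.
\end{aligned}
\]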
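We conclude that there exist positive constants $C_1$, $C_2$ such that

\[
\sum_{0 \le k \le k^*} |c_t(k) - y_t(k)| \,\le\, C_1 \delta\, (1 \vee t) + C_2 \int_0^t \sum_{0 \le k \le k^*} |c_s(k) - y_s(k)|\, ds.
\]

So, by Gronwall's lemma,

\[
|c_t(k_0) - y_t(k_0)| \,\le\, \sum_{0 \le k \le k^*} |c_t(k) - y_t(k)| \,\le\, C_1 \delta\, (1 \vee t)\, e^{C_2 t}.
\]

In particular, for $\ell$ large enough, $q$ small enough and $\ell q$ close enough to $a$, we
have

\[
\sup_{0 \le t \le T} |c_t(k_0) - y_t(k_0)| \,\le\, C_1 \delta\, (1 \vee T)\, e^{C_2 T}.
\]

Since the constants $C_1$, $C_2$ do not depend on $\delta$, which is arbitrary, we conclude that

\[
\limsup_{\substack{\ell \to \infty,\ q \to 0 \\ \ell q \to a}} \ \sup_{0 \le t \le T} |c_t(k_0) - y_t(k_0)| \,=\, 0,
\]

and this is the desired result. □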
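Condition (26.7) compares the lumped mutation matrix with the Poisson kernel $e^{-a} a^{k-i}/(k-i)!$. The sketch below checks this numerically. The lumped matrix is not restated in this section, so the formula coded here is an assumption: each of the $\ell$ sites mutates independently with probability $q$ to one of the $\kappa - 1$ other letters, chosen uniformly. The function names and the parameter values are illustrative.

```python
import math


def lumped_mutation(ell, q, kappa, b, c):
    """Probability that a sequence in Hamming class b mutates into class c.

    Assumed model: independent site mutations with probability q, the new
    letter being uniform among the kappa - 1 other letters.
    """
    back = q / (kappa - 1)  # a non-master site reverts to the master letter
    total = 0.0
    for l in range(b + 1):            # l sites revert to the master letter
        k = c - b + l                 # k master sites are lost
        if 0 <= k <= ell - b:
            total += (
                math.comb(ell - b, k) * q ** k * (1 - q) ** (ell - b - k)
                * math.comb(b, l) * back ** l * (1 - back) ** (b - l)
            )
    return total


def poisson_kernel(a, i, k):
    """Limiting kernel e^{-a} a^{k-i} / (k-i)! for k >= i, and 0 otherwise."""
    if k < i:
        return 0.0
    return math.exp(-a) * a ** (k - i) / math.factorial(k - i)


def max_gap(ell, a, kappa, k_star):
    """Left-hand side of (26.7) for q = a / ell."""
    q = a / ell
    return max(
        sum(
            abs(lumped_mutation(ell, q, kappa, i, k) - poisson_kernel(a, i, k))
            for k in range(i, k_star + 1)
        )
        for i in range(k_star + 1)
    )


gaps = {ell: max_gap(ell, a=1.0, kappa=4, k_star=5) for ell in (50, 500, 5000)}
print(gaps)
```

The gap shrinks as $\ell$ grows with $\ell q = a$ fixed, which is the regime in which (26.7) is used.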


Chapter 27
From Finite to Infinite Population


In this chapter we perform limiting procedures on both the Moran model and the 
Wright—Fisher model, obtaining law of large numbers type results for the trajectories 
of both processes. The limiting deterministic processes are the familiar Eigen model 
and the Moran—Kingman model respectively. 


27.1 From Moran’s to Eigen’s Model 


We consider here the general Moran model for a finite set of types, as defined 
in section 4.3, and we show that, when the population size goes to infinity, the 
trajectories of the Moran model converge in probability to those of the Eigen model. 


Theorem 27.1 Let $(C_t)_{t \ge 0}$ be the concentration process associated to the Moran
model as in section 4.3. We assume that the initial condition converges towards
$c \in S$:

\[
\lim_{m \to \infty} C_0 = c.
\]

Let $(c_t)_{t \ge 0}$ be the solution to Eigen's system of differential equations (4.2) with initial
condition $c_0 = c$. Then, for every $\delta, T > 0$, we have

\[
\lim_{m \to \infty} P\Big( \sup_{0 \le t \le T} |C_{mt} - c_t| > \delta \Big) = 0.
\]


Proof In order to prove this theorem, we use theorem 11.2.1 in [34], which states
in fact that the convergence happens almost surely. In order to use theorem 11.2.1
in [34], it is enough to check that

\[
\sum_{0 \le i,j \le \ell} |e_i - e_j| \ \sup_{c \in S} \Big\{ c(j) \sum_{0 \le h \le \ell} c(h) A_H(h) M_H(h,i) \Big\} < \infty
\]

and that the vector field $F$ defining the system of ODEs $c_t' = F(c_t)$ satisfies

\[
\forall c, c' \in S \qquad |F(c) - F(c')| \le M\, |c - c'|
\]

for some constant $M$. The first condition is satisfied, since the supremum involved
is bounded by $A_H(0)$ and thus the whole expression is bounded by $2A_H(0)(\ell + 1)^2$.
As for the second condition, recall that $F$ is defined by

\[
\forall c \in S \quad \forall k \in \{0, \dots, \ell\} \qquad F(c)(k) = \sum_{0 \le h \le \ell} c(h) A_H(h) M_H(h,k) - c(k)\, \phi(c).
\]

Therefore,

\[
\begin{aligned}
|F(c) - F(c')| &\le \sum_{0 \le k,h \le \ell} |c(h) - c'(h)|\, A_H(h) M_H(h,k) + \phi(c)\, |c - c'| + |\phi(c) - \phi(c')| \\
&\le 3A_H(0)\, |c - c'|.
\end{aligned}
\]

We can thus apply theorem 11.2.1 in [34] to our case, and we obtain the desired
result. □
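The Lipschitz estimate used in the proof can be probed numerically. The sketch below is only an illustration: it assumes that $|\cdot|$ denotes the $\ell^1$ norm, that class 0 carries the largest fitness, and it uses an arbitrary row-stochastic matrix in place of $M_H$. All names and values are hypothetical choices made for the example.

```python
import random


def eigen_field(c, fitness, mutation):
    """Vector field F(c)(k) = sum_h c(h) A(h) M(h,k) - c(k) phi(c)."""
    n = len(c)
    phi = sum(c[h] * fitness[h] for h in range(n))
    return [
        sum(c[h] * fitness[h] * mutation[h][k] for h in range(n)) - c[k] * phi
        for k in range(n)
    ]


def l1_distance(u, v):
    return sum(abs(x - y) for x, y in zip(u, v))


def random_simplex_point(n, rng):
    """Random point of the simplex obtained by normalizing exponential weights."""
    weights = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
n = 6
fitness = [4.0] + [rng.uniform(0.5, 3.0) for _ in range(n - 1)]  # class 0 is the fittest
mutation = [random_simplex_point(n, rng) for _ in range(n)]      # arbitrary stochastic rows

worst_ratio = 0.0
for _ in range(2000):
    c1 = random_simplex_point(n, rng)
    c2 = random_simplex_point(n, rng)
    gap = l1_distance(eigen_field(c1, fitness, mutation), eigen_field(c2, fitness, mutation))
    worst_ratio = max(worst_ratio, gap / l1_distance(c1, c2))

print(worst_ratio, 3 * fitness[0])
```

The observed ratio stays below $3A_H(0)$, as the estimate predicts; it is usually far from sharp.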


27.2 From Wright-Fisher’s to Moran—Kingman’s Model 


We consider here the general Wright—Fisher model for a finite set of types, as 
defined in section 3.3, and we show that, when the population size goes to infinity, 
the trajectories of the Wright—Fisher model converge in probability to those of the 
Moran—Kingman model. 


Theorem 27.2 Let $(C_n)_{n \ge 0}$ be the concentration process associated to the
Wright–Fisher process as in section 3.3. Let $\Phi$ be the mapping from $S$ to $S$ defining the
Moran–Kingman model, as given by (3.2). Let $\delta, N > 0$. For every $c \in S$ and for
any sequence $(c_m)_{m \ge 1}$ such that

\[
\forall m \ge 1 \quad c_m \in S_m = S \cap \frac{1}{m} \mathbb{N}^E, \qquad \lim_{m \to \infty} c_m = c,
\]

we have

\[
\lim_{m \to \infty} P\Big( \max_{0 \le n \le N} |C_n - \Phi^n(c)| > \delta \ \Big|\ C_0 = c_m \Big) = 0.
\]
Proof Let $c \in S_m$. We have, for $\delta > 0$ and $n \ge 0$,

\[
P\big( |C_{n+1} - \Phi(c)| > \delta \ \big|\ C_n = c \big) \,\le\, \sum_{u \in E} P\Big( |C_{n+1}(u) - \Phi(c)(u)| > \frac{\delta}{|E|} \ \Big|\ C_n = c \Big).
\]

Conditionally on $C_n = c$, the random variable $mC_{n+1}(u)$ is distributed according to
a binomial law of parameters $m$ and $\Phi(c)(u)$. Thus, conditionally on $C_n = c$,

\[
E\big(C_{n+1}(u)\big) = \Phi(c)(u), \qquad
\mathrm{Var}\big(C_{n+1}(u)\big) = \frac{\Phi(c)(u)\big(1 - \Phi(c)(u)\big)}{m} \le \frac{1}{4m}.
\]

Using Chebyshev's inequality, we conclude that, for all $c \in S_m$,

\[
\forall \delta > 0 \qquad P\big( |C_{n+1} - \Phi(c)| > \delta \ \big|\ C_n = c \big) \,\le\, \frac{|E|^3}{4m\delta^2}.
\]


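Let now $\delta > 0$ and $N \in \mathbb{N}$. The mapping $\Phi$ is uniformly continuous on the simplex $S$;
we can therefore choose positive numbers $\delta_0, \dots, \delta_N$ satisfying $\delta_k < \delta$ for $0 \le k \le N$
and

\[
\forall k \in \{0, \dots, N-1\} \quad \forall c, c' \in S \qquad
|c - c'| \le \delta_k \ \Longrightarrow\ |\Phi(c) - \Phi(c')| < \frac{\delta_{k+1}}{2}.
\]

Let $c_0 \in S_m$ and let us define

\[
\forall k \in \{1, \dots, N\} \qquad c_k = \Phi(c_{k-1}).
\]

We have

\[
\begin{aligned}
P\Big( \max_{1 \le n \le N} |C_n - c_n| > \delta \ \Big|\ C_0 = c_0 \Big)
&\le P\big( \exists k \in \{1, \dots, N\} \quad |C_k - c_k| > \delta_k \ \big|\ C_0 = c_0 \big) \\
&\le \sum_{k=0}^{N-1} P\big( |C_1 - c_1| \le \delta_1, \dots, |C_k - c_k| \le \delta_k,\ |C_{k+1} - c_{k+1}| > \delta_{k+1} \ \big|\ C_0 = c_0 \big) \\
&\le \sum_{k=0}^{N-1} P\big( |C_{k+1} - c_{k+1}| > \delta_{k+1} \ \big|\ |C_k - c_k| \le \delta_k \big) \\
&\le \sum_{k=0}^{N-1} \sup \Big\{ P\big( |C_{k+1} - c_{k+1}| > \delta_{k+1} \ \big|\ C_k = c \big) : c \in S_m,\ |c - c_k| \le \delta_k \Big\}.
\end{aligned}
\]

Now, we deduce from the definition of the $\delta_k$'s and from the previous calculations
that, for all $c \in S_m$ such that $|c - c_k| \le \delta_k$,

\[
P\big( |C_{k+1} - c_{k+1}| > \delta_{k+1} \ \big|\ C_k = c \big)
\,\le\, P\Big( |C_{k+1} - \Phi(c)| > \frac{\delta_{k+1}}{2} \ \Big|\ C_k = c \Big)
\,\le\, \frac{|E|^3}{m\, \delta_{k+1}^2}.
\]

We conclude that

\[
P\Big( \max_{1 \le n \le N} |C_n - c_n| > \delta \ \Big|\ C_0 = c_0 \Big) \,\le\, \frac{N |E|^3}{m \min_{1 \le k \le N} \delta_k^2},
\]

which goes to 0 when $m$ goes to infinity. □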
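The law of large numbers above is easy to observe by simulation. The following sketch relies on the description used in the proof (conditionally on $C_n = c$, the next generation consists of $m$ independent draws from $\Phi(c)$). The explicit form of $\Phi$ is given by (3.2), which is not restated here, so the selection-then-mutation formula coded below is an assumption, as are the small fitness vector and mutation matrix.

```python
import random
from collections import Counter


def moran_kingman_map(c, fitness, mutation):
    """Assumed form of the map Phi: selection followed by mutation."""
    n = len(c)
    phi = sum(c[h] * fitness[h] for h in range(n))
    return [
        sum(c[h] * fitness[h] * mutation[h][k] for h in range(n)) / phi
        for k in range(n)
    ]


def wright_fisher_step(c, m, fitness, mutation, rng):
    """Draw the m individuals of the next generation independently from Phi(c)."""
    probs = moran_kingman_map(c, fitness, mutation)
    counts = Counter(rng.choices(range(len(c)), weights=probs, k=m))
    return [counts[u] / m for u in range(len(c))]


def max_deviation(m, n_gen, c0, fitness, mutation, rng):
    """Maximum over 1 <= n <= n_gen of |C_n - Phi^n(c0)| in the l1 norm."""
    c_random, c_det, worst = list(c0), list(c0), 0.0
    for _ in range(n_gen):
        c_random = wright_fisher_step(c_random, m, fitness, mutation, rng)
        c_det = moran_kingman_map(c_det, fitness, mutation)
        worst = max(worst, sum(abs(x - y) for x, y in zip(c_random, c_det)))
    return worst


fitness = [3.0, 1.0, 1.0]
mutation = [[0.80, 0.15, 0.05],
            [0.10, 0.80, 0.10],
            [0.05, 0.15, 0.80]]
c0 = [0.2, 0.3, 0.5]
rng = random.Random(1)
deviations = {
    m: max_deviation(m, 10, c0, fitness, mutation, rng)
    for m in (100, 10_000, 100_000)
}
print(deviations)
```

The deviation is expected to shrink roughly like $1/\sqrt{m}$, in line with the variance bound $1/(4m)$ used in the proof.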


Chapter 28
Class-Dependent Landscapes


In this chapter, we consider a fitness function that is both class-dependent and 
eventually constant. We analyze the behavior of the Moran model and the Wright— 
Fisher model in the asymptotic regime 


\[
m \to \infty, \qquad \ell \to \infty, \qquad q \to 0, \qquad \ell q \to a, \qquad \frac{m}{\ell} \to \alpha.
\]


For the Moran model, we state a large deviations principle, as well as a couple of 
its consequences. For the Wright—Fisher model we state the analog of theorem 11.2. 
We work with the following hypothesis. 


Hypothesis 28.1 We suppose that the fitness of the Hamming class 0 is greater than
the fitness of the other classes, i.e., $A_H(0) > A_H(k) > 0$ for all $k \ge 1$. We suppose
further that there exists a class $K \ge 1$ such that $A_H(k) = 1$ for all $k > K$.


28.1 Moran Model 


We consider the concentration process $(C_t)_{t \ge 0}$ associated to the Moran model as in
section 4.3. We take $E$ to be the set of the Hamming classes $\{0, \dots, \ell\}$, along with
a fitness function $A_H$ satisfying hypothesis 28.1 and the lumped mutation matrix
$M_H$. As shown in theorem 27.1, when the population is large, the Moran model
tends to follow the trajectories of the associated Eigen model. Moreover, when $\ell$ is
large, $q$ small and $\ell q$ is close to $a$, the Eigen model converges to its infinite version,
which is given by the infinite system of differential equations

\[
\forall k \ge 0 \qquad \frac{dy_t(k)}{dt} \,=\, \sum_{0 \le i \le k} y_t(i)\, A_H(i)\, e^{-a} \frac{a^{k-i}}{(k-i)!} \,-\, y_t(k)\, \phi(y_t). \tag{28.1}
\]

Note that under hypothesis 28.1, we can write the mean fitness as

\[
\phi(y) \,=\, \sum_{0 \le k \le K} A_H(k)\, y(k) + 1 - \sum_{0 \le k \le K} y(k),
\]
so that for any $k \ge K$, the dynamics of the coordinates $0, \dots, k$ does not depend on
the remaining coordinates. For $k \ge K$, we define the set $D^k$ by

\[
D^k = \big\{ c \in [0,1]^{k+1} : c(0) + \dots + c(k) \le 1 \big\},
\]

as well as the subset

\[
D^k_m = D^k \cap (\mathbb{Z}/m)^{k+1}.
\]

In the sequel, we consider the restriction of the system (28.1) to the set $D^k$. Likewise,
we consider the restriction of the Moran process to the set $D^k_m$. We still denote these
restricted processes by $(y_t)_{t \ge 0}$ and $(C_t)_{t \ge 0}$, but it should be clear from the context
whether we are referring to the original processes or to the restricted processes.
Considering these restricted processes allows us to work on a finite-dimensional
state space, rather than an infinite-dimensional one. Let $d$ denote the Euclidean
distance in $D^k$, and let $d_T$ denote the supremum distance between two trajectories
in $D^k$, i.e., for $\psi, \psi' : [0,T] \to D^k$,

\[
d_T(\psi, \psi') = \sup_{0 \le t \le T} d\big(\psi(t), \psi'(t)\big).
\]


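The large deviations principle will quantify how likely it is for the Moran process
to be close to a given trajectory in $D^k$, the distance being measured in the above
supremum metric. We define next the large deviations action functional, following
chapter 5 in [38]. Let $y \in D^k$ and define, for $0 \le i, j \le k$,

\[
\begin{aligned}
r(e_i; y) &= \big(1 - |y|_1\big) \sum_{0 \le h \le i} y_h A(h)\, e^{-a} \frac{a^{i-h}}{(i-h)!}, \\
r(-e_i; y) &= y_i \Big( \phi(y) - \sum_{0 \le j \le k} \ \sum_{0 \le h \le j} y_h A(h)\, e^{-a} \frac{a^{j-h}}{(j-h)!} \Big), \\
r(e_i - e_j; y) &= y_j \sum_{0 \le h \le i} y_h A(h)\, e^{-a} \frac{a^{i-h}}{(i-h)!}.
\end{aligned}
\]

Let $V$ be the set of vectors

\[
V = \{ e_i, -e_i : 0 \le i \le k \} \cup \{ e_i - e_j : 0 \le i, j \le k,\ i \ne j \}.
\]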

The rates $r(v; y)$ can be seen as the limiting rates for the Moran process, and relate
to the system of ODEs as follows:

\[
\frac{dy_t}{dt} \,=\, \sum_{v \in V} v\, r(v; y_t).
\]
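The link between the jump rates and the restricted system (28.1) can be checked numerically: summing the jump vectors weighted by their rates should give back the drift of the ODE. The sketch below codes the three families of rates as read from the definitions above (the original typesetting is partly damaged, so this reading is an assumption), with classes beyond the truncation assumed to have fitness 1. Function names and numerical values are illustrative.

```python
import math


def poisson_weight(a, n):
    return math.exp(-a) * a ** n / math.factorial(n)


def inflow(y, fitness, a, i):
    """Rate at which offspring of class i are produced by classes 0..i."""
    return sum(y[h] * fitness[h] * poisson_weight(a, i - h) for h in range(i + 1))


def mean_fitness(y, fitness):
    # classes beyond the truncation are assumed to have fitness 1
    return sum(fitness[h] * y[h] for h in range(len(y))) + 1.0 - sum(y)


def unit_vector(i, n):
    return tuple(1 if j == i else 0 for j in range(n))


def rates(y, fitness, a):
    """Dictionary mapping each jump vector (as a tuple) to its rate r(v; y)."""
    n = len(y)
    phi = mean_fitness(y, fitness)
    total_inflow = sum(inflow(y, fitness, a, j) for j in range(n))
    out = {}
    for i in range(n):
        e_i = unit_vector(i, n)
        out[e_i] = (1.0 - sum(y)) * inflow(y, fitness, a, i)
        out[tuple(-x for x in e_i)] = y[i] * (phi - total_inflow)
        for j in range(n):
            if j != i:
                e_j = unit_vector(j, n)
                out[tuple(p - q for p, q in zip(e_i, e_j))] = y[j] * inflow(y, fitness, a, i)
    return out


def drift_from_rates(y, fitness, a):
    """Sum over v in V of v * r(v; y)."""
    drift = [0.0] * len(y)
    for v, r in rates(y, fitness, a).items():
        for k in range(len(y)):
            drift[k] += v[k] * r
    return drift


def ode_rhs(y, fitness, a):
    """Right-hand side of the restricted system (28.1)."""
    phi = mean_fitness(y, fitness)
    return [inflow(y, fitness, a, k) - y[k] * phi for k in range(len(y))]


y_example = [0.3, 0.2, 0.1]
fitness_example = [4.0, 2.0, 1.0]
a_example = 0.7
drift = drift_from_rates(y_example, fitness_example, a_example)
rhs = ode_rhs(y_example, fitness_example, a_example)
print(drift, rhs)
```

Both vectors coincide up to rounding, and all the rates are non-negative on $D^k$.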


We define the function $G$ by setting, for all $y \in D^k$ and $z \in \mathbb{R}^{k+1}$,

\[
G(y; z) = \sum_{v \in V} r(v; y) \big( e^{v \cdot z} - 1 \big),
\]

and we define the mapping $H$ to be the Fenchel–Legendre transform of $G$ with
respect to the second argument:

\[
H(y; u) = \sup_{z \in \mathbb{R}^{k+1}} \big( u \cdot z - G(y; z) \big).
\]
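To make the Fenchel–Legendre transform concrete, consider the simplest situation, which is an assumption made only for this illustration: a one-dimensional process with a single right rate $r > 0$ and a single left rate $l > 0$, so that $G(z) = r(e^z - 1) + l(e^{-z} - 1)$. The supremum is then attained at $e^{z^*} = \big(u + \sqrt{u^2 + 4rl}\big)/(2r)$. The sketch compares this closed form with a brute-force maximization over a grid; the names and values are hypothetical.

```python
import math


def G(r, l, z):
    """Log-Laplace-type function of a walk jumping right at rate r, left at rate l."""
    return r * (math.exp(z) - 1.0) + l * (math.exp(-z) - 1.0)


def H(r, l, u):
    """Fenchel-Legendre transform of G at u, using the explicit maximizer."""
    z_star = math.log((u + math.sqrt(u * u + 4.0 * r * l)) / (2.0 * r))
    return u * z_star - G(r, l, z_star)


def H_grid(r, l, u, grid):
    """Brute-force approximation of the supremum over a finite grid of z values."""
    return max(u * z - G(r, l, z) for z in grid)


r, l = 0.6, 0.25
grid = [k / 1000.0 for k in range(-5000, 5001)]
velocities = (-1.0, 0.1, 2.0)
comparison = {u: (H(r, l, u), H_grid(r, l, u, grid)) for u in velocities}
cost_along_drift = H(r, l, r - l)
print(comparison, cost_along_drift)
```

The cost vanishes exactly when the velocity $u$ equals the drift $r - l$, which is why trajectories following the deterministic system have zero action.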


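The action functional for the Moran process in the asymptotic regime

\[
m \to \infty, \qquad \ell \to \infty, \qquad q \to 0, \qquad \ell q \to a, \qquad \frac{m}{\ell} \to \alpha
\]

is given by: for $T > 0$ and $\psi : [0,T] \to D^k$,

\[
I^k_T(\psi) =
\begin{cases}
\displaystyle \int_0^T H\big(\psi(t); \psi'(t)\big)\, dt & \text{if $\psi$ is absolutely continuous and the integral converges,} \\[1ex]
+\infty & \text{otherwise.}
\end{cases}
\]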


Unfortunately, we are not able to obtain a large deviations principle that works in
the whole space $D^k$. Indeed, since the probability of a back mutation to the master
sequence can be of order $q$, when the number of master sequences in the population
is of order $1/q$, the Moran model is still governed by the stochastic fluctuations of
the number of master sequences. Thus, in order for the deterministic limit system to
provide a useful approximation of the Moran process, it is essential that the number
of master sequences present in the population is proportional to the size of the
population. Moreover, problems of a more technical nature also arise when trying to
prove the large deviations principle for trajectories that pass close to the boundary
of $D^k$. Let $\rho > 0$ and let us define the set

\[
D^k_\rho = \big\{ y \in D^k : d\big(y, \mathbb{R}^{k+1} \setminus D^k\big) \ge \rho \big\}.
\]

Given a point $y^0 \in D^k_\rho$ and $\eta \ge 0$, we denote by $\Phi^\rho_{y^0}(\eta)$ the set of the trajectories
$\psi$ in $D^k_\rho$ whose starting point is $y^0$, and whose action is lower than $\eta$, that is,

\[
\Phi^\rho_{y^0}(\eta) = \big\{ \psi : [0,T] \to D^k_\rho : \psi(0) = y^0 \text{ and } I^k_T(\psi) \le \eta \big\}.
\]

Let us denote further by $\tau_\rho$ the exit time of the Moran process from the set $D^k_\rho$,

\[
\tau_\rho = \inf \big\{ t \ge 0 : C_t \notin D^k_\rho \big\},
\]

and let $C^\rho$ be the Moran process stopped at the time $\tau_\rho$, i.e., $C^\rho_t = C_{t \wedge \tau_\rho}$. We are
now able to state the large deviations principle.


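Theorem 28.2 (LDP for the Moran process) We have the following bounds.
Lower bound. For every $\delta, \gamma, \rho, \eta_0, T > 0$ with $\rho > \delta$, we have, asymptotically,
uniformly over $y^0 \in D^k_\rho$ and $\psi \in \Phi^\rho_{y^0}(\eta_0)$,

\[
P\big( d_T(C, \psi) < \delta \ \big|\ C_0 = y^0 \big) \,\ge\, \exp\big( -m\, ( I^k_T(\psi) + \gamma ) \big).
\]

Upper bound. For every $\delta, \gamma, \rho, \eta_0, T > 0$ with $\rho > \delta$, we have, asymptotically,
uniformly over $y^0 \in D^k_\rho$ and $\eta \le \eta_0$,

\[
P\big( d_T\big(C^\rho, \Phi^\rho_{y^0}(\eta)\big) > \delta \ \big|\ C_0 = y^0 \big) \,\le\, \exp\big( -m\, (\eta - \gamma) \big).
\]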


We won't prove this theorem here. A similar result has been proven in [21]. We
state next a number of conclusions that follow more or less directly from the above
large deviations principle. Firstly, the large deviations principle provides us with
an excellent tool to prove the analog of theorem 11.1 for class-dependent fitness
landscapes, which we state next. Let $y^*$ denote the unique quasispecies distribution
as given in theorem 22.3. For $a \in\, ]0, +\infty[$, we define the vector $Q(A_H, a) \in D^K$ by

\[
Q(A_H, a) =
\begin{cases}
\big( y^*(0), \dots, y^*(K) \big) & \text{if } A_H(0)e^{-a} > 1, \\
0 & \text{if } A_H(0)e^{-a} \le 1.
\end{cases}
\]

Finally, for $a \in\, ]0, +\infty[$, we define

\[
\varphi(a) = \inf \Big\{ I^K_T(\psi) : T > 0,\ \psi(0) = Q(A_H, a),\ \psi(T) = 0,\ \psi(t) \in D^K \text{ for } 0 \le t \le T \Big\}.
\]


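Theorem 28.3 Suppose that

\[
\ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0, \qquad
\ell q \to a \in\, ]0, +\infty[, \qquad \frac{m}{\ell} \to \alpha \in [0, +\infty].
\]

We have the following dichotomy:
• If $\alpha\varphi(a) < \ln \kappa$, then

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{t \to \infty} E\big(C_t(k)\big) = 0.
\]

• If $\alpha\varphi(a) > \ln \kappa$, then

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{t \to \infty} E\big(C_t(k)\big) = y^*(k).
\]

Furthermore, in both cases

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{t \to \infty} \mathrm{Var}\big(C_t(k)\big) = 0.
\]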
A similar statement has already been proven for the Wright–Fisher process, also
using a large deviations principle, but the problem remains open in the case of the
Moran process. One of the principal difficulties in showing Theorem 28.3 lies not
in the effective application of the large deviations principle, but in dealing with
populations that fall outside the set $D^k_\rho$ in which the large deviations principle works.
Indeed, for populations falling outside the set $D^k_\rho$, a better control is needed than the
one provided by the large deviations principle.

Secondly, the large deviations principle also applies to the case of the sharp 
peak landscape, and can help to explain in greater detail several phenomena. As 
an example, we may consider the extinction of a quasispecies. Here by extinction 
we mean the evolution towards a population containing 0 master sequences, when 
starting from a population containing a proportion $\rho^* = (\sigma e^{-a} - 1)/(\sigma - 1)$ of master
sequences. We know that the time of extinction is of order e””*, but how does this 
extinction actually happen? 

Let $M_t = C_t(0)$ denote the proportion of the master sequences in the Moran model
on the sharp peak landscape. The process $M_t$ takes its values in $[0,1] \cap \mathbb{Z}/m$, and
it can only perform $1/m$-sized jumps either right or left; thus, in this case, the rates
$r(v)$, $v \in V$, are

\[
\begin{aligned}
r(y) &= r(e_0, y) = (1 - y)\, y\, A(0) e^{-a}, \\
l(y) &= r(-e_0, y) = y \big( \phi(y) - y A(0) e^{-a} \big),
\end{aligned}
\]

where in this case $\phi(y) = A(0)y + 1 - y$ only depends on $y$. For $\varepsilon > 0$, we define the
following random times:

\[
\begin{aligned}
\tau_\varepsilon &= \inf \{ t \ge 0 : M_t < \varepsilon \}, \\
\theta_\varepsilon &= \sup \{ t \le \tau_\varepsilon : M_t > \rho^* - \varepsilon \}.
\end{aligned}
\]

For $x \in [0,1]$, we define $(m^x(t))_{t \ge 0}$ to be the solution of the differential equation

\[
m'(t) = l(m(t)) - r(m(t)), \qquad m(0) = x.
\]

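The trajectory of extinction of the quasispecies is described by the trajectories $m^x(t)$
as stated in the following result.


Theorem 28.4 For all $\delta > 0$ and $x > \varepsilon > 0$, we have

\[
\lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}}
P\Big( \sup_{\theta_\varepsilon \le t \le \tau_\varepsilon} \big| M_t - m^{\rho^* - \varepsilon}(t - \theta_\varepsilon) \big| > \delta \ \Big|\ M_0 = x \Big) = 0.
\]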


This theorem has been proven in [21]. We jump now to the description of the situation 
for the Wright—Fisher model in the case of class-dependent fitness landscapes. 
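The deterministic path appearing in Theorem 28.4 is easy to compute. With the rates $r(y)$ and $l(y)$ above, a short computation gives $l(y) - r(y) = -(A(0) - 1)\, y\, (\rho^* - y)$, so the solution started slightly below $\rho^*$ decreases monotonically towards 0. The sketch below integrates this equation with an explicit Euler scheme; the values $A(0) = 5$, $a = 1$, $\varepsilon = 0.01$ and the function names are illustrative assumptions.

```python
import math


def right_rate(y, sigma, a):
    """r(y) = (1 - y) y A(0) e^{-a}, with A(0) = sigma."""
    return (1.0 - y) * y * sigma * math.exp(-a)


def left_rate(y, sigma, a):
    """l(y) = y (phi(y) - y A(0) e^{-a}), with phi(y) = A(0) y + 1 - y."""
    mean_fitness = sigma * y + 1.0 - y
    return y * (mean_fitness - y * sigma * math.exp(-a))


def extinction_path(x, sigma, a, t_end, dt=1e-3):
    """Euler approximation of m'(t) = l(m(t)) - r(m(t)), m(0) = x."""
    path = [x]
    for _ in range(int(round(t_end / dt))):
        m = path[-1]
        path.append(m + dt * (left_rate(m, sigma, a) - right_rate(m, sigma, a)))
    return path


sigma, a, eps, dt = 5.0, 1.0, 0.01, 1e-3
rho_star = (sigma * math.exp(-a) - 1.0) / (sigma - 1.0)
path = extinction_path(rho_star - eps, sigma, a, t_end=20.0, dt=dt)
hitting_time = dt * next(i for i, m in enumerate(path) if m < eps)
print(rho_star, hitting_time)
```

The path leaves the neighbourhood of $\rho^*$ slowly, then falls below $\varepsilon$ in a time of order $\ln(1/\varepsilon)$; this is the time-reversed image of the relaxation towards $\rho^*$.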
28.2 The Wright-Fisher Model


We consider the concentration process $(C_n)_{n \ge 0}$ associated to the Wright–Fisher
process as in section 3.3. We take $E$ to be the set of the Hamming classes $\{0, \dots, \ell\}$,
along with a fitness function $A_H$ satisfying hypothesis 28.1 and the lumped
mutation matrix $M_H$. As shown in theorem 27.2, when the population is large, the
Wright–Fisher process tends to follow the trajectories of the associated
Moran–Kingman model. Moreover, when $\ell$ is large, $q$ small and $\ell q$ close to $a$, the
Moran–Kingman model converges to its infinite version, which is given by the iterations of
the mapping $\Psi_\infty$, defined by

\[
\forall k \ge 0 \quad \forall c \in S^\infty \qquad
\Psi_\infty(c)(k) = \phi(c)^{-1} \sum_{0 \le i \le k} c(i)\, A_H(i)\, e^{-a} \frac{a^{k-i}}{(k-i)!}.
\]


The mapping $\Psi_\infty$ is a mapping from $S^\infty$ to itself. Note however, that under
hypothesis 28.1, for any $k \ge K$, the dynamics of the coordinates $0, \dots, k$ does not
depend on the remaining coordinates. With a slight abuse of notation, we will still
denote the restriction of the mapping $\Psi_\infty$ to the set $D^K$ by $\Psi_\infty$. Let us denote by
$I_K(p, t)$ the rate function governing the large deviations of a multinomial distribution:
for $p, t \in D^K$,

\[
I_K(p, t) = \sum_{i=0}^{K-1} t_i \ln \frac{t_i}{p_i} + \big(1 - |t|_1\big) \ln \frac{1 - |t|_1}{1 - |p|_1},
\]

with the convention that $0 \ln 0 = 0 \ln(0/0) = 0$. As for the Moran process, we
can write down a large deviations principle for the Wright–Fisher process using
the rate function of the multinomial distribution. However, in this case, the large
deviations principle is less amenable to further analysis, and obtaining a result on
the extinction trajectory of the quasispecies in the spirit of theorem 28.4 is more complicated
than for the Moran process. Surprisingly, generalizing theorem 11.2 to the case of
class-dependent fitness functions has fewer technical difficulties than in the case of
the Moran process, so rather than stating the large deviations principle, we jump
directly to presenting the generalization of theorem 11.2. Let $y^*$ denote the unique
quasispecies distribution as given in theorem 22.3. For $a \in\, ]0, +\infty[$, we define the
quantity $Q(A_H, a) \in D^K$ by

\[
Q(A_H, a) =
\begin{cases}
\big( y^*(0), \dots, y^*(K) \big) & \text{if } A_H(0)e^{-a} > 1, \\
0 & \text{if } A_H(0)e^{-a} \le 1.
\end{cases}
\]

Finally, for $a \in\, ]0, +\infty[$, we define

\[
\psi(a) = \inf_{l \ge 1} \ \inf \Big\{ \sum_{k=0}^{l-1} I_K\big( \Psi_\infty(p_k), p_{k+1} \big) :
p_0 = Q(A_H, a),\ p_l = 0,\ p_k \in D^K \text{ for } 0 \le k < l \Big\}.
\]
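The quantities entering $\psi(a)$ are simple to evaluate for a given finite path. The sketch below implements the multinomial rate function with the stated convention, the restriction of $\Psi_\infty$ to the first coordinates (classes beyond the truncation are assumed to have fitness 1, as in hypothesis 28.1), and the cost of a path $p_0, \dots, p_l$. It makes no attempt at the minimization defining $\psi(a)$; the two-class example and all names are hypothetical.

```python
import math


def multinomial_rate(p, t):
    """sum_i t_i ln(t_i / p_i) + (1 - |t|) ln((1 - |t|) / (1 - |p|)), with 0 ln 0 = 0."""
    def term(ti, pi):
        if ti <= 0.0:
            return 0.0          # convention 0 ln 0 = 0 ln(0/0) = 0
        if pi <= 0.0:
            return math.inf     # positive mass where p has none
        return ti * math.log(ti / pi)

    rest = term(1.0 - sum(t), 1.0 - sum(p))
    return sum(term(ti, pi) for ti, pi in zip(t, p)) + rest


def psi_infinity(c, fitness, a):
    """Restriction of the map Psi_infinity to the coordinates 0..len(c)-1."""
    n = len(c)
    phi = sum(fitness[i] * c[i] for i in range(n)) + 1.0 - sum(c)
    return [
        sum(
            c[i] * fitness[i] * math.exp(-a) * a ** (k - i) / math.factorial(k - i)
            for i in range(k + 1)
        ) / phi
        for k in range(n)
    ]


def path_cost(path, fitness, a):
    """Sum of I(Psi_infinity(p_k), p_{k+1}) along the path."""
    return sum(
        multinomial_rate(psi_infinity(p, fitness, a), q)
        for p, q in zip(path, path[1:])
    )


fitness, a = [4.0, 2.0], 1.0
orbit = [[0.3, 0.2]]
for _ in range(5):
    orbit.append(psi_infinity(orbit[-1], fitness, a))   # deterministic trajectory
cost_orbit = path_cost(orbit, fitness, a)

extinction = [[0.3, 0.2], [0.1, 0.1], [0.0, 0.0]]       # a path ending at 0
cost_extinction = path_cost(extinction, fitness, a)
print(cost_orbit, cost_extinction)
```

A path that follows the iterates of $\Psi_\infty$ costs nothing, while any path forcing the population towards 0 has a strictly positive cost; $\psi(a)$ is the cheapest such cost.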


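Theorem 28.5 We suppose that

\[
\ell \to +\infty, \qquad m \to +\infty, \qquad q \to 0, \qquad
\ell q \to a \in\, ]0, +\infty[, \qquad \frac{m}{\ell} \to \alpha \in [0, +\infty].
\]

We have the following dichotomy:
• If $\alpha\psi(a) < \ln \kappa$, then

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{n \to \infty} E\big(C_n(k)\big) = 0.
\]

• If $\alpha\psi(a) > \ln \kappa$, then

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{n \to \infty} E\big(C_n(k)\big) = y^*(k).
\]

Moreover, in both cases,

\[
\forall k \ge 0 \qquad \lim_{\substack{\ell, m \to \infty,\ q \to 0 \\ \ell q \to a,\ m/\ell \to \alpha}} \ \lim_{n \to \infty} \mathrm{Var}\big(C_n(k)\big) = 0.
\]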


This theorem has been proven in [24]. 
Appendix A
Markov Chains and Classical Results


In this appendix, we first recall some classical definitions and results from the theory 
of Markov chains with finite state space. The goal is to clarify the objects involved 
in the definition of the models, and to state the fundamental general results used in 
the proofs. This material can be found in any reference book on Markov chains, for 
instance [37, 52, 68]. We also present the lumping theorem, the FKG inequality and 
Hoeffding’s inequality. The definitions and results on monotonicity, coupling and 
the FKG inequality are exposed in the books of Liggett [60] and Grimmett [43]. 


A.1 Monotonicity 


We first recall some standard definitions concerning monotonicity and coupling
for stochastic processes. A classical reference is Liggett's book [60], especially for
applications to particle systems. In the next two definitions, we consider a discrete
time Markov chain $(X_n)_{n \ge 0}$ with values in a space $\mathcal{E}$. We suppose that the state space
$\mathcal{E}$ is finite and that it is equipped with a partial order $\le$. A function $f : \mathcal{E} \to \mathbb{R}$ is
non-decreasing if

\[
\forall x, y \in \mathcal{E} \qquad x \le y \ \Longrightarrow\ f(x) \le f(y).
\]

Definition A.1 The Markov chain $(X_n)_{n \ge 0}$ is said to be monotone if, for any
non-decreasing function $f$, the function

\[
x \in \mathcal{E} \ \mapsto\ E\big( f(X_1) \,\big|\, X_0 = x \big)
\]

is non-decreasing.
A natural way to prove monotonicity is to construct an adequate coupling.


Definition A.2 A coupling for the Markov chain $(X_n)_{n \ge 0}$ is a family of processes
$(X^x_n)_{n \ge 0}$ indexed by $x \in \mathcal{E}$, which are all defined on the same probability space, and
such that, for $x \in \mathcal{E}$, the process $(X^x_n)_{n \ge 0}$ is the Markov chain $(X_n)_{n \ge 0}$ starting from
$X_0 = x$. The coupling is said to be monotone if

\[
\forall x, y \in \mathcal{E} \qquad x \le y \ \Longrightarrow\ \forall n \ge 1 \quad X^x_n \le X^y_n.
\]

If there exists a monotone coupling, then the Markov chain is monotone.
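A standard way to build a monotone coupling is to drive all the copies of the chain with the same source of randomness through an update map that is non-decreasing in the state. The sketch below does this for a toy chain chosen only for illustration (a lazy walk on $\{0, \dots, 5\}$ reflected at both ends); it is not one of the models of the book.

```python
import random

N = 5  # state space {0, ..., N} with its usual order


def update(x, u):
    """One step of a lazy reflected walk driven by the uniform variable u.

    For each fixed u, the map x -> update(x, u) is non-decreasing.
    """
    if u < 1.0 / 3.0:
        return min(x + 1, N)
    if u > 2.0 / 3.0:
        return max(x - 1, 0)
    return x


def coupled_trajectories(n_steps, rng):
    """Run one copy of the chain from every starting point with shared randomness."""
    states = list(range(N + 1))          # X_0^x = x for every starting point x
    history = [states]
    for _ in range(n_steps):
        u = rng.random()                 # the same uniform variable for all copies
        states = [update(x, u) for x in states]
        history.append(states)
    return history


history = coupled_trajectories(200, random.Random(0))
order_preserved = all(row[x] <= row[x + 1] for row in history for x in range(N))
print(order_preserved)
```

Each copy is individually a realization of the chain, and the order between copies is preserved at every step, which is exactly the property required in Definition A.2.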


A.2 Construction of Markov Processes


Continuous time. The most convenient way to define a continuous time process is
to give its infinitesimal generator. The infinitesimal generator of a Markov process
$(X_t)_{t \ge 0}$ with values in a finite state space $\mathcal{E}$ is the linear operator $L$ acting on the
functions from $\mathcal{E}$ to $\mathbb{R}$ defined as follows. For any function $\phi : \mathcal{E} \to \mathbb{R}$, any $x \in \mathcal{E}$,

\[
L\phi(x) = \lim_{t \to 0} \frac{1}{t} \Big( E\big( \phi(X_t) \,\big|\, X_0 = x \big) - \phi(x) \Big).
\]

It turns out that the law of the process $(X_t)_{t \ge 0}$ is entirely determined by the
generator $L$. Therefore all the probabilistic results on the process $(X_t)_{t \ge 0}$ can in principle
be derived working only with its infinitesimal generator.

In the case where the state space of the process is finite, the situation is quite
simple and it is possible to provide direct constructions of a process having a specific
infinitesimal generator. These constructions are not unique, but they provide more
insight into the dynamics. Suppose that the generator $L$ is given by

\[
\forall x \in \mathcal{E} \qquad L\phi(x) = \sum_{y \in \mathcal{E}} c(x, y) \big( \phi(y) - \phi(x) \big).
\]

The evolution of a process $(X_t)_{t \ge 0}$ having $L$ as infinitesimal generator can loosely
be described as follows. Suppose that $X_t = x$. Let

\[
c(x) = \sum_{y \ne x} c(x, y).
\]

Let $\tau$ be a random variable whose law is exponential with parameter $c(x)$:

\[
\forall s \ge 0 \qquad P(\tau \ge s) = \exp(-c(x)s).
\]

The process waits at $x$ until time $t + \tau$. At time $t + \tau$, it jumps to a state $y \ne x$ chosen
according to the following law:

\[
\forall y \ne x \qquad P(X_{t+\tau} = y) = \frac{c(x, y)}{c(x)}.
\]

The same scheme is then applied starting from $y$. In this construction, the waiting
times $\tau$ and the jumps are all independent.
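The construction just described translates directly into a simulation routine: wait an exponential time of parameter $c(x)$, then choose the next state proportionally to the rates $c(x, y)$. The sketch below is a minimal version for a finite state space; the two-state example and the function names are illustrative choices.

```python
import random


def simulate(rates, x0, t_end, rng):
    """Simulate the jump process up to time t_end.

    rates[x][y] is the rate c(x, y) for y != x. Returns the list of
    (jump time, new state) pairs, starting with (0.0, x0).
    """
    t, x = 0.0, x0
    jumps = [(0.0, x0)]
    while True:
        targets = [y for y, c in rates[x].items() if y != x and c > 0.0]
        total = sum(rates[x][y] for y in targets)        # c(x)
        if total == 0.0:
            break                                        # absorbing state
        t += rng.expovariate(total)                      # waiting time ~ Exp(c(x))
        if t >= t_end:
            break
        weights = [rates[x][y] for y in targets]
        x = rng.choices(targets, weights=weights)[0]     # next state ~ c(x, y) / c(x)
        jumps.append((t, x))
    return jumps


def occupation_fraction(jumps, state, t_end):
    """Fraction of the time interval [0, t_end] spent in the given state."""
    time_in_state = 0.0
    for (t0, x), (t1, _) in zip(jumps, jumps[1:] + [(t_end, None)]):
        if x == state:
            time_in_state += t1 - t0
    return time_in_state / t_end


rates = {"A": {"B": 2.0}, "B": {"A": 1.0}}
jumps = simulate(rates, "A", 5000.0, random.Random(0))
fraction_A = occupation_fraction(jumps, "A", 5000.0)
print(fraction_A)
```

For this two-state chain the invariant measure gives mass $1/3$ to the state A, and the simulated occupation fraction should be close to that value.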
Discrete time. To build a discrete time Markov chain, we need only to define its
transition mechanism. When the state space $\mathcal{E}$ is finite, this amounts to giving its
transition matrix

\[
\big( p(x, y),\ x, y \in \mathcal{E} \big).
\]

The only requirement on $p$ is that it is a stochastic matrix, i.e., it satisfies

\[
\forall x, y \in \mathcal{E} \qquad 0 \le p(x, y) \le 1,
\]

\[
\forall x \in \mathcal{E} \qquad \sum_{y \in \mathcal{E}} p(x, y) = 1.
\]

In the sequel, we consider a discrete time Markov chain $(X_t)_{t \ge 0}$ with values in a
finite state space $\mathcal{E}$ and with transition matrix $(p(x, y))_{x, y \in \mathcal{E}}$.


A.3 Lumping 


The basic lumping result for Markov chains can be found in section 6.3 of the book of
Kemeny and Snell [53]. Let $(E_1, \dots, E_r)$ be a partition of $\mathcal{E}$. Let $f : \mathcal{E} \to \{1, \dots, r\}$
be the function defined by

\[
\forall x \in \mathcal{E} \qquad f(x) =
\begin{cases}
1 & \text{if } x \in E_1, \\
\ \vdots & \\
r & \text{if } x \in E_r.
\end{cases}
\]

The Markov chain $(X_t)_{t \ge 0}$ is said to be lumpable with respect to the partition
$(E_1, \dots, E_r)$ if, for every initial distribution $\mu_0$ of $X_0$, the process $(f(X_t))_{t \ge 0}$ is a
Markov chain on $\{1, \dots, r\}$ whose transition probabilities do not depend on $\mu_0$.


Theorem A.3 (Lumping theorem) A necessary and sufficient condition for the
Markov chain $(X_t)_{t \ge 0}$ to be lumpable with respect to the partition $(E_1, \dots, E_r)$ is
that,

\[
\forall i, j \in \{1, \dots, r\} \quad \forall x, y \in E_i \qquad \sum_{z \in E_j} p(x, z) = \sum_{z \in E_j} p(y, z).
\]

Suppose that this condition holds. For $i, j \in \{1, \dots, r\}$, let us denote by $p_E(i, j)$ the
common value of the above sums. The process $(f(X_t))_{t \ge 0}$ is then a Markov chain
with transition matrix $(p_E(i, j))_{1 \le i, j \le r}$.
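The condition of the lumping theorem is straightforward to test on a finite chain. The sketch below checks it and, when it holds, returns the lumped transition matrix. As an illustration it uses a toy model that is an assumption of this example: binary sequences of length 3 whose sites flip independently with probability $q$, lumped according to the number of ones (the Hamming class with respect to the sequence 000).

```python
import math
from itertools import product


def lump(p, blocks, tol=1e-12):
    """Return the lumped matrix if the lumpability condition holds, else None.

    p is a dictionary of dictionaries, p[x][y] being the transition
    probability from x to y; blocks is the partition, as a list of lists.
    """
    lumped = []
    for block_i in blocks:
        row = []
        for block_j in blocks:
            sums = [sum(p[x][z] for z in block_j) for x in block_i]
            if max(sums) - min(sums) > tol:
                return None              # the sums depend on the state within the block
            row.append(sums[0])
        lumped.append(row)
    return lumped


def hamming(x, y):
    return sum(xi != yi for xi, yi in zip(x, y))


ell, q = 3, 0.1
states = list(product((0, 1), repeat=ell))
p = {
    x: {y: q ** hamming(x, y) * (1 - q) ** (ell - hamming(x, y)) for y in states}
    for x in states
}

classes = [[x for x in states if sum(x) == k] for k in range(ell + 1)]
lumped = lump(p, classes)
expected_first_row = [
    math.comb(ell, k) * q ** k * (1 - q) ** (ell - k) for k in range(ell + 1)
]

bad_partition = [[states[0]], states[1:]]   # master sequence versus everything else
not_lumpable = lump(p, bad_partition)
print(lumped[0], expected_first_row, not_lumpable)
```

The chain is lumpable on the Hamming classes, and the first row of the lumped matrix is the binomial law of parameters $\ell$ and $q$; the coarser partition fails the condition.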


A.4 The FKG Inequality 


We consider the product space $[0,1]^n$ equipped with the product order. Let $\mu$ be
a probability measure on $[0,1]$ and let us denote by $\mu^{\otimes n}$ the product probability
measure on $[0,1]^n$ whose marginals are equal to $\mu$. The Harris inequality, or the
FKG inequality in this context, says that, for any non-decreasing functions
$f, g : [0,1]^n \to \mathbb{R}$, we have

\[
\int_{[0,1]^n} f g \, d\mu^{\otimes n} \,\ge\, \int_{[0,1]^n} f \, d\mu^{\otimes n} \int_{[0,1]^n} g \, d\mu^{\otimes n}.
\]

The case of Bernoulli product measures is exposed in section 2.2 of Grimmett's
book [43].
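For a Bernoulli product measure the inequality can be verified by exhaustive enumeration. The sketch below does so for two non-decreasing functions on $\{0,1\}^4$; the functions and the parameter $p = 0.3$ are arbitrary choices made for the illustration.

```python
from itertools import product


def expectation(f, n, p):
    """Expectation of f under the product of n Bernoulli(p) measures."""
    total = 0.0
    for omega in product((0, 1), repeat=n):
        weight = 1.0
        for bit in omega:
            weight *= p if bit else 1.0 - p
        total += weight * f(omega)
    return total


def f(omega):
    """Non-decreasing: equals 1 as soon as one coordinate equals 1."""
    return float(max(omega))


def g(omega):
    """Non-decreasing: indicator that at least two coordinates equal 1."""
    return float(sum(omega) >= 2)


n, p = 4, 0.3
lhs = expectation(lambda omega: f(omega) * g(omega), n, p)
rhs = expectation(f, n, p) * expectation(g, n, p)
print(lhs, rhs)
```

The left-hand side is the expectation of the product and the right-hand side is the product of the expectations; the first one is the larger, as the inequality requires.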


A.5 Hoeffding’s Inequality 


We state Hoeffding’s inequality for Bernoulli random variables [46]. 


Theorem A.4 Suppose that $X$ is a random variable with law the binomial law
$\mathrm{Bin}(n, p)$. We have

\[
\forall t > 0 \qquad P\big( |X - np| > nt \big) \,\le\, 2 \exp(-2nt^2).
\]
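Since the binomial law is explicit, the bound can be compared with the exact tail probability. The sketch below does this for a few values of $t$; the parameters $n = 200$ and $p = 0.3$ are arbitrary illustrative choices.

```python
import math


def binomial_two_sided_tail(n, p, t):
    """Exact value of P(|X - np| > nt) for X with law Bin(n, p)."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - n * p) > n * t
    )


def hoeffding_bound(n, t):
    return 2.0 * math.exp(-2.0 * n * t * t)


n, p = 200, 0.3
table = {
    t: (binomial_two_sided_tail(n, p, t), hoeffding_bound(n, t))
    for t in (0.05, 0.1, 0.2)
}
for t, (tail, bound) in table.items():
    print(t, tail, bound)
```

The exact tail is always below the bound; the bound does not depend on $p$, which makes it convenient but not sharp.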